(57) Abstract: The present invention discloses a novel single nucleotide polymorphism (SNP) in the isolated 5' tandem repeats of 
the thymidylate synthase (TS) gene and methods for its use. The novel SNP, located in the 12th nucleotide of a 28 bp third tandem 
repeat (3R) of the TS gene, substitutes a C for a G, and is the variant form of the repeat. Subjects with the wild-type form of 3R have 
greater transcription of the TS gene than subjects with the variant form. The invention also reveals that a six base pair deletion in the 3' 
region of TS (-6 bp/1494) indicates mRNA instability and thus reduced production of TS. In diseased tissue, such as cancer, reduced 
production of TS is beneficial because it prevents the cancerous cells from growing and spreading. Analysis of either polymorphism 
or both together allows for prediction of a subject's response to chemotherapeutic and anti-cardiovascular disease treatments because 
both diseases are related to TS levels in a subject. 

SPECIFICATION 

THYMIDYLATE SYNTHASE POLYMORPHISMS FOR USE IN 
SCREENING FOR CANCER SUSCEPTIBILITY 

CROSS-REFERENCE TO RELATED APPLICATIONS 
[0001] The present application is a continuation-in-part of United States 
Provisional Application No. 60/420,164, entitled "A Novel Single Nucleotide 
Polymorphism in the Tandem Repeats of the Thymidylate Synthase Gene Alters USF-1 
Binding and Transcriptional Activation," filed October 21, 2002, which is 
incorporated by reference in its entirety herein. 

TECHNICAL FIELD 
[0002] The present invention relates to the field of medical genetics and disease 
susceptibility screening. Specifically, the present invention relates to the 
identification, prognostic use and therapeutic use of a single nucleotide polymorphism 
in the 5' region of thymidylate synthase (TS) gene. The polymorphism indicates the 
transcriptional activity of the TS gene, and relatedly, the risk of cancer and 
cardiovascular disease. The invention also relates to the prognostic and therapeutic 
use of and screening methods for a six base pair polymorphism found in the 3' 
untranslated region of TS. 

BACKGROUND OF THE INVENTION 
[0003] Thymidylate synthase (TS) is an important enzyme in the nucleotide 
biosynthetic pathway that converts dUMP to dTMP via reductive methylation. The 
TS reaction is the only source of de novo thymidylate in the cell and is thus essential 
for DNA replication (Friedkin, et al., 1957; Heidelberger et al., 1957; Santi et al., 
1984). The critical role of TS in nucleotide metabolism has made it a common target 
for a variety of chemotherapeutic agents including 5-fluorouracil (5-FU), raltitrexed 
(Tomudex), capecitabine (Xeloda), and pemetrexed (Alimta) (Danenberg, 1977; 
Papamichael, 1999). Inhibition of TS by these agents leads to cytotoxicity induced by 
dTTP pool depletion leading to thymineless death (Houghton, 1999), and in some 
instances uracil misincorporation into DNA (Aherne, 1999; Ladner, 2001), which 
causes irreparable strand breaks through the action of uracil-DNA-glycosylase. 
Limited efficacy of TS inhibitors in the treatment of human cancers has been a 
common phenomenon. Resistance to fluoropyrimidines arises through a variety of 
mechanisms, including increases in TS transcription (Shibata et al. 1998) and 
translation (Kaneda, et al., 1987; Keyomarsi et al., 1993). 

[0004] With respect to the relationship between TS, cardiovascular disease 
(CVD), and other defects, TS and an enzyme called methylenetetrahydrofolate 
reductase (MTHFR) compete for limited supplies of folate required for the 
remethylation of homocysteine (Trinh, 2002). Low plasma folate and high 
homocysteine levels have been independently and collectively correlated with an 
increased risk of CVD. Specifically, elevated plasma homocysteine is a known risk 
factor for occlusive vascular disease, venous thrombosis, neural tube defects and 
pregnancy complications. 

[0005] A polymorphism within the 5'-untranslated region of the TS gene, 
consisting of tandem repeats of 28 base pairs, has been implicated in modulating TS 
mRNA expression (Kaneda, et al., 1987; Horie et al., 1995) and TS mRNA 
translational efficiency (Kawakami, et al., 2001). Although there have been reports of 
4, 5 and 9 repeats within certain African and Asian populations (Marsh et al., 1999; 
Marsh et al., 2000; Luo et al., 2002), the majority of individual human TS alleles 
harbor either a double repeat (2R) or a triple repeat (3R) for this polymorphism 
creating genotypes of 2R/2R, 2R/3R and 3R/3R. Individuals that are homozygous for 
the 3R were found to have elevated intratumoral TS mRNA (Pullarkat et al., 2001) and 
protein levels compared to 2R homozygotes (Kawakami et al., 1999). 

[0006] In addition, the 5' tandem repeat polymorphism of the TS gene has been 
identified as a predictor of clinical outcome to 5-FU based chemotherapy in both 
adjuvant and metastatic settings (Pullarkat et al., 2001; Villafranca et al., 2001; Marsh 
et al., 2001; Iacopetta et al., 2001) as well as being associated with predicting risk and 
outcome of acute lymphoblastic leukemia (Krajinovic et al., 2002; Skibola et al., 
2002). The tandem repeats have also been shown to predict plasma folate and 
homocysteine levels (Trinh et al., 2002) and the risk of colorectal adenomas (Ulrich et 
al., Cancer Res., 2002). Although screening for the 5' tandem repeats alone has shown 
great promise, more accurate and comprehensive screens are needed. In 
particular, the identification of novel functional polymorphisms that can be added to 
already useful tests may help enhance the predictive value of the tests. Testing for 
several polymorphisms in conjunction with each other may further increase the 
predictive value of determining an individual's risk for cancer or CVD and also that 
individual's response to known treatments for the disease. 

[0007] USF-1 and USF-2 (upstream stimulatory factors) belong to a family of 
transcriptional regulatory factors bearing helix-loop-helix domains, similar to cMyc, 
and are found together to a large extent as heterodimers in the cell (Sirito, et al., 1992; 
Viollet et al., 1996). The E-box is a consensus element for the helix-loop-helix USF 
transcriptional activator family of proteins (Singh et al., 1994; Kiermaier et al., 1999; 
Luo et al., 1996; Ferre-D'Amare et al., 1994). The DNA binding activity of USF-1 to 
E-box (CANNTG) consensus sequences is regulated through phosphorylation by 
cdc2/p34 (Cheung et al., 1999) and the stress-responsive p38 kinase (Galibert et al., 
2001). Phosphorylation of USF-1 by these kinases has been shown to activate USF-1 
transcriptional activity under normal and stressful conditions, respectively. 

[0008] Through its DNA binding activity, USF-1 has been shown to 
transactivate a variety of genes including p53 (Reisman et al., 1993) and the 
Adenomatous Polyposis Coli (APC) protein (Jaiswal, et al., 2001). Although both 
USF-1 and USF-2 were thought to be ubiquitously expressed factors, recent evidence 
suggests that USF-1 and USF-2 are differentially regulated in some cancer cells 
(Ismail et al., 1999). The differential regulation may have an effect on the ability of 
USF-1/USF-2 complexes to form and function properly. 

[0009] Another polymorphism within the TS gene, consisting of a 6 bp deletion 
of the sequence TTAAAG at nucleotide 1494 of the TS mRNA ("-6 bp/1494"), has 
been recently discovered through searching the public Expressed Sequence Tag (EST) 
database (Ulrich, 2000). This common polymorphism is also transcribed into the 
3'UTR of the primary TS transcript. Little is currently known about the 3'UTR of TS. 
3'UTRs function as post-transcriptional regulators mainly through control of mRNA 
stability and/or translational efficiency, and are thought to play an important role in the 
overall fate of mRNAs (Grzybowska, 2001). Traditionally, it was thought that the 
function of 3'UTRs was governed primarily through mRNA secondary structural 
elements, such as stem loop structures. 

[0010] Although this remains true, recent evidence has shown a growing 
number of cis-binding sequence elements within 3'UTRs that interact with RNA-binding 
regulatory proteins in a sequence-specific manner. In fact, the regulation of 
some well characterized mRNAs has been shown to be dependent, in part, on cis-binding 
sequences within the 3'UTR. For example, the 3'UTRs of COX-2 and p21WAF1 
mRNAs have been shown to be essential for the proper post-transcriptional regulation 
of these transcripts (Cok, 2001; Giles, 2003). Further, polymorphisms in the 3'UTRs 
of other mRNAs have been shown to have a functional effect on overall gene 
expression. A polymorphism in the 3'UTR of the dihydrofolate reductase mRNA, 
which encodes a critical enzyme that is involved in folate metabolism, plays a 
functional role in governing the post-transcriptional regulation of the mRNA and in 
the overall regulation of gene expression (Goto, 2001). 

[0011] Until this point, the molecular mechanism by which the 5' tandem-repeat 
polymorphism enhances transcription has not been elucidated. Further, 
differences in the nucleotide sequences of the repeats have not been considered as 
playing a functional role in transcription and post-transcriptional events. It would be a 
significant improvement in the art to identify the regulatory factor(s) responsible for 
binding within the polymorphic region and enhancing TS mRNA expression. This 
improvement would allow an understanding of why 3R repeats show increased TS 
transcription as compared to 2R. It would then permit diagnostic and therapeutic use 
of the functional difference to identify and treat patients at risk for diseases, such as 
cancer and CVD, related to the TS pathway, which would result in significantly 
improved and targeted treatments. 

[0012] Additionally, the 3'UTR of TS mRNA has not been studied up to this 
point as playing a functional role in post-transcriptional regulation. Further, the 
molecular mechanism(s) by which the -6 bp/1494 deletion polymorphism may affect 
the regulation of TS mRNA has yet to be elucidated. Characterizing the regions 
within the TS 3'UTR that may be responsible for the post-transcriptional regulation of 
TS mRNA, and identifying the mechanism(s) by which the deletion polymorphism 
affects TS mRNA regulation, would be a significant discovery in this area. Revealing 
the effect that the 6 base pair polymorphism has in the 3'UTR of TS RNA provides an 
additional screen for predicting an individual's TS level. It also improves the chances 
of success of targeted clinical therapies, including cancer therapies directed to 
blocking TS creation or function. 

SUMMARY OF THE INVENTION 
[0013] A first aspect of the invention identifies a novel single-nucleotide 
polymorphism (SNP) within the third tandem repeat that determines the binding and 
transactivating ability of USF complexes and occurs at a high frequency in the tested 
population. The clinical data show that screening for the G to C SNP in 
combination with the tandem repeat polymorphism (3RV) significantly increases the 
value of the tandem repeats in predicting response to, and survival after, cancer treatment, 
particularly 5-FU/LV. Individuals with two regular 3R copies have the worst 
response. 3RV copies increase the response to treatment for cancer and/or CVD. 

[0014] In an additional aspect of the invention, USF-1 and USF-2 are identified 
as factors that bind within the tandem repeat polymorphism of the TS 5' regulatory 
region. 

[0015] A third aspect of the invention shows that USF-1 enhances transcription 
of 2R, 3R and 3RV TS reporter gene constructs in a luciferase assay system and that 
the impact of a 2R or 3R genotype on TS transcriptional activation is ultimately 
related to the presence or absence of the USF binding sites. 

[0016] Yet another aspect of the invention encompasses a diagnostic kit for 
screening for cancer and/or cardiovascular risk by examining the TS SNP 
polymorphism in conjunction with the 5' tandem repeat polymorphism alone, the 3' 
-6 bp/1494 polymorphism alone, or using both the TS polymorphisms in conjunction 
with each other. The screen uses the genetic material of an individual to examine 
which polymorphism, or combination of polymorphisms, exists in that individual. The 
diagnostic kit comprises one or more relevant diagnostic primers or probes and/or 
allele-specific oligonucleotide primers of the invention. The kit may also comprise 
packaging, vials and tubes, instructions for use, buffer, polymerase, and/or other 
reaction components. 

[0017] In a further aspect, the diagnostic methods of the invention are used to 
predict the chance of an individual developing cancer and/or CVD. Relatedly, 
screening for the polymorphisms, alone or in combination, can predict the efficacy 
of therapeutic compounds in the treatment of cancer and cardiovascular-related 
diseases via use of high throughput screening (HTS). The HTS rapidly and efficiently 
screens multiple patients for cancer and/or cardiovascular risk. For example, if an 
individual has a lower rate of transcription of TS, that person likely has a lesser chance 
of developing tumors and a better chance of fighting/shrinking the tumors that 
currently exist. 

[0018] A related aspect of the present invention is the pharmacogenetic use of 
the TS SNP and tandem repeats and/or -6 bp/1494 polymorphism to identify patients 
most suited to therapy with particular pharmaceutical agents and use of the TS SNP in 
pharmaceutical research to assist the drug selection process. 

[0019] Another object of the invention is to provide a useful target for linkage 
analysis and disease association studies. 

[0020] Yet another object is to develop novel molecular diagnostic markers 
useful in the detection of CVD and cancer. 

[0021] Another aspect of the invention comprises the use of gene alteration or 
replacement to induce the polymorphisms that produce the desired transcription with 
respect to the TS gene. For example, if reduced activity were desired, the TS gene 
would be manipulated to include those sequences that result in reduced transcription 
and/or activity. 

[0022] Another aspect of the invention is blocking the production and/or activity of 
the TS enzyme in the target cells. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0023] Figure 1 is a sequence depiction of a tandem repeat polymorphism 
within the 5'-untranslated region of the human TS gene. The position of the E-box is 
indicated. 

[0024] Figure 2 is a gel showing that USF proteins bind to the E-box site within 
the tandem repeats of the human TS gene in HT29 nuclear extracts. 

[0025] Figure 3 is a picture of four gels displaying different aspects of USF-1 
activity. Figure 3A shows phosphorylation of recombinant USF-1 by cdc2/p34 in 
vitro. Figure 3B shows, by EMSA, that phosphorylated recombinant USF-1 binds to its 
consensus sequence. Figure 3C shows that phospho-USF-1 binds to the TS 
tandem repeats bearing an E-box site. Finally, Figure 3D shows that USF-1 does not 
bind to the variant TS tandem repeats with a G->C base change at the 12th nucleotide. 

[0026] Figure 4 is the result of a ChIP assay, demonstrating that USF-1 and 
USF-2 bind to the TS 5' UTR in vivo. 

[0027] Figure 5A is a diagram of the structure of the TS luciferase reporter 
constructs. Figure 5B is a bar chart showing the levels of activation of the TS gene 
promoter by USF-1. 

[0028] Figure 6A is the HaeIII restriction map of the TS tandem repeat 
fragments produced in the RFLP analysis. Figure 6B is a gel showing the results of a 
restriction fragment length polymorphism (RFLP) analysis used for screening of the 
tandem repeats, as well as the G->C SNP. 
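
The basis of an RFLP screen for the G->C SNP can be illustrated computationally. The following is a minimal sketch, not the procedure of the invention. It assumes the 28 bp repeat sequence given elsewhere in this specification as SEQ ID NO:1 and the standard HaeIII recognition sequence GGCC; the helper names `apply_snp` and `find_sites` are hypothetical. Under these assumptions, the wild-type repeat carries a GGCC site beginning at the 12th nucleotide, and the G->C change at that nucleotide removes it.

```python
# Illustrative sketch only: locate HaeIII recognition sites (GGCC) in a single
# 28 bp TS tandem repeat, with and without the G->C change at nucleotide 12.

WILD_TYPE_REPEAT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"  # SEQ ID NO:1 as given in the text
SNP_POSITION = 12  # 1-based position of the G->C substitution


def apply_snp(seq, position=SNP_POSITION, new_base="C"):
    """Return a copy of seq with the base at the 1-based position replaced."""
    i = position - 1
    return seq[:i] + new_base + seq[i + 1:]


def find_sites(seq, site="GGCC"):
    """Return the 1-based start position of every occurrence of site in seq."""
    width = len(site)
    return [i + 1 for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


VARIANT_REPEAT = apply_snp(WILD_TYPE_REPEAT)

wild_type_sites = find_sites(WILD_TYPE_REPEAT)  # site present in the wild-type repeat
variant_sites = find_sites(VARIANT_REPEAT)      # site lost in the variant repeat

print("wild-type HaeIII sites:", wild_type_sites)
print("variant HaeIII sites:  ", variant_sites)
```

The sketch considers one isolated repeat only; the actual fragment pattern depends on the full amplified region and its flanking sequence, which are not reproduced here.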

[0029] Figure 7 shows that the 3'UTR of TS contains no elements of transcript 
instability or translational silencing. Figure 7A is the structure of the chimeric 
luciferase reporter constructs bearing proximal and distal end deletions of the TS 
3'UTR. The TS 3'UTR sequences were inserted between the luciferase coding region 
and poly(A) signal. Transcription was controlled by the SV40 promoter in all 
constructs. The luciferase gene is indicated in the white bars and the TS 3'UTR 
regions are shown in black bars. The numbers indicate the region of TS 3'UTR that 
was inserted, and numbering begins just after the stop codon of TS. 

[0030] Figure 7B shows the activity and mRNA levels of the TS 3'UTR 
reporter constructs. Luciferase activity (black bars) was normalized to 
β-Galactosidase activity and is expressed as a percentage of the activity from the empty 
pGL3-control vector. Luciferase mRNA levels (white bars) were normalized to 
internal GAPDH mRNA levels and are expressed as a percentage of the mRNA levels 
of the empty pGL3-control vector. All results are the mean ± S.E. for 3 independent 
experiments, each measured in duplicate. a = significantly different (p < 0.005) from 
pGL3-control; b = significantly different (p < 0.05) from pGL3-control. 

[0031] Figure 8 demonstrates that the -6 bp/1494 deletion polymorphism 
causes decreased luciferase activity and message levels compared to +6 bp/1494 
constructs. Figure 8A shows the structure of the chimeric luciferase reporter constructs 
bearing proximal end deletions of the TS 3'UTR that contain either the +6 bp/1494 
insertion or the -6 bp/1494 deletion polymorphism. The deletion polymorphism lies 
at nucleotide 456 of the TS 3'UTR. The TS 3'UTR sequences were inserted between 
the luciferase coding region and poly(A) signal. Transcription was controlled by the 
SV40 promoter in all constructs. The luciferase gene is indicated in the white bars and 
the TS 3'UTR regions are shown in black bars (gaps indicate the -6 bp/1494 deletion 
polymorphism). The numbers indicate the region of TS 3'UTR that was inserted, and 
numbering begins just after the stop codon of TS. The brackets indicate 
the construct counterparts that should be compared and the + and - indicate that the 
constructs contain either the +6 bp/1494 or -6 bp/1494 polymorphism. 

[0032] Figure 8B shows the activity and mRNA levels of the TS 3'UTR 
reporter constructs. Luciferase activity (black bars) was normalized to 
β-Galactosidase activity and is expressed as a percentage of the activity from the empty 
pGL3-control vector. Luciferase mRNA levels (white bars) were normalized to 
internal GAPDH mRNA levels and are expressed as a percentage of the mRNA levels 
of the empty pGL3-control vector. The brackets indicate the construct counterparts 
that should be compared and the + and - indicate that the construct contains either the 
+6 bp/1494 or -6 bp/1494 polymorphism. All results are the mean ± S.E. for 3 
independent experiments, each measured in duplicate. a = significantly different (p < 
0.005) from pGL3-control; b = significantly different (p < 0.05) from pGL3-control; c 
= significantly different (p < 0.005) from respective +6 bp/1494 counterpart; d = 
significantly different (p < 0.05) from respective +6 bp/1494 counterpart. Inset shows 
a representative RT-PCR of chimeric message electrophoresed on a 2% agarose gel. 
The internal GAPDH control (510 bp) PCR was run in the same reaction as the 
luciferase (97 bp) PCR. 

[0033] Figure 9 shows that the -6 bp/1494 deletion polymorphism causes 
decreased mRNA stability. 293 cells were transfected with reporter gene constructs 
and treated with actinomycin D (final concentration of 10 µg/ml) 24 hours 
post-transfection. Total RNA was extracted at various time points for 6 hours, reverse 
transcribed and assayed for luciferase and GAPDH levels by semi-quantitative PCR. 
Luciferase mRNA levels were normalized to GAPDH message and are expressed as a 
percentage of the mRNA present at the 0 h time point (100%). All time points are 
the results of three independent experiments performed in duplicate. 
Asterisks (*) indicate that the message levels of the -6 bp/1494 constructs were 
significantly different (p < 0.05) from their +6 bp/1494 counterparts at the 2, 4 and 6 
hour time points. 

DETAILED DESCRIPTION OF THE INVENTION 
I. OVERVIEW 

[0034] A first aspect of the invention is the discovery of the isolated nucleic 
acid comprising a thymidylate synthase single nucleotide polymorphism (TS SNP), 
and probes and primers therefor. "Isolated" means not naturally occurring. "Isolated 
nucleic acid" means a nucleic acid that is not immediately contiguous with the 5' and 
3' flanking sequences with which it normally is immediately contiguous when present 
in the naturally occurring genome of the organism from which it is derived. "Isolated 
nucleic acid" may describe a nucleic acid that is incorporated into a vector, 
incorporated into the genome of a heterologous cell, or that exists as a separate 
molecule. The phrase may also describe a recombinant nucleic acid that forms part of 
a hybrid gene encoding additional polypeptide sequences that may be used to produce 
a fusion protein. Thus, the TS SNP isolated nucleic acid may take any of these forms. 

[0035] Further, there is the potential to create and utilize probes to and primers 
for the TS 5' SNP and/or the 3' -6 bp/1494 polymorphism. A probe for the TS SNP 
could be created such that the probe would only bind to the variant form of the TS 
tandem repeats comprising the TS SNP. A probe for the 3' -6 bp/1494 polymorphism 
could be created such that the probe would only bind to the wild type form (+6 
bp/1494) or only bind to the variant form (-6 bp/1494) of the polymorphism. The 
probe may bind to any purified or nonpurified nucleic acid, provided that it 
contains the TS gene or, more specifically, the polymorphic portions of the TS 
gene of interest. The nucleic acid may be single or double stranded DNA or RNA, 
including messenger RNA. 

[0036] A probe is the term for a piece of DNA or RNA corresponding to the 
gene or sequence of interest. Here, the sequence of interest is the third tandem repeat 
in the 5' region of the TS gene and/or the 3' untranslated region of TS. The first probe 
has a sequence that is complementary to the sequence of the isolated nucleic acid of 
interest, and which selectively binds to the variant form of the tandem repeat but not to 
the wild-type form of the tandem repeat. The second probe has a sequence that is 
complementary to the sequence of the isolated nucleic acid of interest, and which 
selectively binds to the selected form of the 3' UTR of the TS, but not to the alternate 
form. Preferably, the probe is labeled for easy detection. Labels, for example, biotin, 
digoxigenin, or fluorescein, and methods for their attachment to the probe are known 
in the art. 
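
By way of illustration only, the complementarity relationship between a probe and the variant repeat can be sketched as follows. The sketch assumes the 28 bp repeat sequence given elsewhere in this specification as SEQ ID NO:1, with C in place of G at the 12th nucleotide for the variant form; the function name `reverse_complement` and the choice of a full-length 28-mer are assumptions made for the example and do not describe any particular probe of the invention. Whether such a probe discriminates a single-base difference in practice depends on probe length and hybridization stringency.

```python
# Illustrative sketch only: derive a sequence complementary to the variant
# repeat and count how many bases differ from the wild-type complement.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


WILD_TYPE_REPEAT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"  # SEQ ID NO:1 as given in the text
# Variant form: G replaced by C at the 12th nucleotide (index 11).
VARIANT_REPEAT = WILD_TYPE_REPEAT[:11] + "C" + WILD_TYPE_REPEAT[12:]

variant_probe = reverse_complement(VARIANT_REPEAT)
wild_type_probe = reverse_complement(WILD_TYPE_REPEAT)

# The two candidate probes differ at a single base, which is the only
# position available for discriminating the two forms of the repeat.
mismatches = sum(a != b for a, b in zip(variant_probe, wild_type_probe))

print("variant probe:  ", variant_probe)
print("wild-type probe:", wild_type_probe)
print("differing bases:", mismatches)
```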

[0037] Another embodiment of the present invention is a primer for the nucleic 
acid comprising the TS SNP and/or the 3' UTR polymorphism, which, like a probe, 
hybridizes to the sequence of interest. However, primers also allow for extension of 
the nucleic acid sequence with the addition of free nucleotides, polymerase, and other 
necessary reagents into the reaction mixture. Hybridization means selectively binding 
to a nucleotide sequence under stringent conditions. Here, the stringent conditions are 
those that permit binding to the variant 3R tandem repeat, but not the wild-type 3R 
tandem repeat, or vice versa. With respect to the 3' UTR polymorphism, stringent 
conditions are those that permit the binding of one form of the polymorphism, but not 
the other. 

[0038] Primers are typically used within an amplification procedure, such as 
PCR. For polymerase chain reaction (PCR) amplification of regions of the TS gene 
containing a polymorphism, nucleoside triphosphates (dATP, dCTP, dGTP, and 
dTTP), a polymerizing agent and proper temperature, ionic strength and pH are 
required. Preferably, the primer is single-stranded and sufficiently long to allow 
synthesis via extension using the polymerizing agent. The oligonucleotide primer 
typically contains 8-40 nucleotides, but preferably 12-35 nucleotides. PCR 
allows for exponential amplification of a portion of nucleic acid. 

[0039] Primers should be "substantially" complementary to the nucleic acid 
being amplified, meaning that the primers must be sufficiently complementary to 
hybridize with their respective strands and permit the amplification to occur. The 
primers may be prepared using conventional or automated phosphotriester and 
phosphodiester methods. Preferably, the primer extension is performed in the presence 
of A, C, G, and T/U nucleotide terminators, each of which is labeled with a different 
label that identifies the base contained in the terminator. A nucleotide terminator is a 
nucleotide or nucleoside that is covalently linkable to the extendible end of a primer, 
but is not capable of further extension. Preferably, the labels are fluorescent labels 
with four different emission wavelengths. 

[0040] PCR proceeds by annealing primers to denatured nucleic acid, followed by 
extension with polymerase or another enzyme, and then repeated cycles of 
denaturing, primer annealing, and extension. Specific conditions for the PCR may be 
found in the "Experimental" section or are known in the art. The final amplified 
regions of TS may be detected by Southern blots with or without using radioactive 
probes. A "region" is an area from several nucleotides upstream to several nucleotides 
downstream from the specific nucleotide mentioned and also includes the 
complementary nucleotides on the antisense strand of sample DNA. Nonradioactive 
probes include a fluorescent compound, a bioluminescent compound, a 
chemiluminescent compound, a metal chelator or an enzyme. The amplification 
products can also be separated using an agarose gel containing ethidium bromide. 

[0041] Relatedly, the probe may be part of a nucleic acid array in which an 
oligonucleotide hybridizes to the sequence comprising the TS SNP and/or the 3' UTR 
polymorphism. In this embodiment, an array of nucleic acid molecule targets is 
attached to a solid support. If the array is screening for the 5' TS SNP, it comprises an 
oligonucleotide that will hybridize to a nucleic acid molecule consisting of 
CCGCGCCACTTGGCCTGCCTCCGTCCCG [SEQ ID NO:1], wherein at position 
12, G is replaced by C, under conditions in which the oligonucleotide will not 
substantially hybridize to a nucleic acid molecule consisting of SEQ ID NO:1. If the 
array is screening for the 3' UTR polymorphism, it comprises an oligonucleotide that 
will hybridize to a nucleic acid molecule having one of the forms of the 
polymorphism. For example, the array may comprise an oligonucleotide that will 
hybridize to a molecule having a +6 bp/1494 region, but not to a molecule having a -6 
bp/1494 region. An array may also be designed where the converse is true: an 
oligonucleotide that will hybridize to a molecule having a -6 bp/1494 region, but not 
to a molecule having a +6 bp/1494 region. An array may have one or a plurality of 
target elements, including, but not limited to, both the TS targets revealed herein. 

[0042] A different aspect of the present invention focuses on the upstream 
stimulatory factors (USF-1 and USF-2) that bind to the key regions of the tandem 
repeats, called E-boxes. The present invention contemplates manipulation of the 
binding of these USF elements, both through manipulation of the USF elements 
themselves and their binding regions. For example, since USF binding leads to TS 
transcription, which in turn leads to increased risk of cancer and cardiovascular 
disease, the USF binding elements themselves could be altered so as not to bind with 
the efficacy or frequency of wild-type USF elements. This alteration could occur 
through mutation of the nucleic acid that codes for the USF, through blocking the 
protein assembly, or through alteration of the USF protein's function after assembly. 
Additionally, any alteration of the E-box of the tandem repeat sequences that prevents 
USF binding, specifically the G to C substitution at the 12th nucleotide of the third 
tandem repeat, is contemplated. Any alteration that prevents the binding of the USF 
factors is within the scope of the present invention. 
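
The effect of the substitution on the E-box can be checked directly against the repeat sequence. The following is a minimal sketch under stated assumptions: it uses the 28 bp repeat given as SEQ ID NO:1 and the E-box consensus CANNTG described in the background, and the names `EBOX_PATTERN` and `ebox_positions` are hypothetical. Because CANNTG reads the same on both strands, only one strand is scanned. Under these assumptions the wild-type repeat contains the hexamer CACTTG ending at the 12th nucleotide, and replacing that G with C leaves no match to the consensus.

```python
# Illustrative sketch only: search one 28 bp TS repeat for the E-box
# consensus CANNTG before and after the G->C change at nucleotide 12.
import re

EBOX_PATTERN = re.compile(r"CA[ACGT]{2}TG")  # CANNTG, N = any base

WILD_TYPE_REPEAT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"  # SEQ ID NO:1 as given in the text
# Variant form: G replaced by C at the 12th nucleotide (index 11).
VARIANT_REPEAT = WILD_TYPE_REPEAT[:11] + "C" + WILD_TYPE_REPEAT[12:]


def ebox_positions(seq):
    """Return (1-based start, matched hexamer) for each E-box match in seq."""
    return [(m.start() + 1, m.group()) for m in EBOX_PATTERN.finditer(seq)]


wild_type_eboxes = ebox_positions(WILD_TYPE_REPEAT)
variant_eboxes = ebox_positions(VARIANT_REPEAT)

print("wild-type E-boxes:", wild_type_eboxes)
print("variant E-boxes:  ", variant_eboxes)
```

A sequence match of this kind indicates only that a consensus site is present or absent; it does not by itself measure USF binding or transcriptional activity.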

[0043] The next aspects of the invention relate to methods of using the novel TS 
SNP in the 5' region and/or the 3' UTR polymorphism to discover disease 
susceptibility of an individual not having a disease and optimal disease treatment 
pathways including drug selection for an individual having a disease. Further, the 
methods contemplated comprise using the polymorphisms as molecular markers and 
for linkage analysis and using genetic manipulation to better an individual's chances 
of surviving a disease or of not contracting a disease at all. The diseases focused on in 
the present invention are cancer and cardiovascular disease, although the methods of 
the invention are applicable to any other disease in which thymidylate synthase is 
implicated. 

[0044] To identify the TS polymorphisms, nucleic acid must first be extracted 
from the subject. Preferably, the subject is a human and blood is the source of the 
nucleic acid. However, any bodily fluid that contains suitable nucleic acid specimens 
is contemplated, including lymph, saliva, urine, or other bodily excretions. 
Alternatively, the nucleic acid could be derived from soft tissue, hair, or bone. When 
methods of obtaining nucleic acid from a human and determining whether the human 
has the novel TS SNP and/or the 3' UTR polymorphism are utilized, it is preferable 
that the nucleic acid is amplified and sequenced using methods well known in the 
genetic arts, such as the PCR methods discussed throughout this disclosure. Another 
preferable embodiment of the present invention uses high throughput screening 
methods to test multiple samples at the same time. Typically in high throughput 
screening, the nucleic acid molecules to be tested are bound to a solid support, such as 
a microtiter dish, amplified and labeled, and the results read by a machine adapted to 
such use. 

[0045] Once the TS SNP and/or the 3' UTR polymorphism is screened for and 
has been identified or found absent in a particular patient, other methods of the 
invention are prognostic and diagnostic methods that provide for the indication of 
whether a patient will be a good candidate for chemotherapeutic and/or anti-CVD 
drugs. It has been found that high levels of TS transcription are linked to 
shorter survival as compared to patients with a lower TS transcription rate (Ulrich et 
al., 2002). With respect to the relationship between TS and cardiovascular disease, TS 
and the enzyme 5,10-methylenetetrahydrofolate reductase compete for limited 
supplies of folate required for the remethylation of homocysteine, an amino acid found 
in blood. Elevated levels of homocysteine are used to identify patients at increased risk 
of CVD (Trinh et al., 2002). Thus, the relationships between TS and cancer and, 
independently, TS and CVD make the TS SNP a valuable tool for screening for (1) the 
likelihood that a given individual will develop cancer or CVD, (2) the potential 
severity of the relevant disease, and (3) treatments that are more likely to work given 
the form of the TS gene. 

[0046] For example, if a patient has two copies of the 3R wild-type form of the
TS gene (3R/3R), then there are two USF E-boxes per allele and the transcription of
TS in that person will likely be higher than in a person with 3R/3RV, 2R/2R, 2R/3R, or
2R/3RV TS alleles. A physician would then try to design a useful therapy for that
person, knowing that TS expression is high. Useful therapies might include targeting
the TS gene to reduce TS expression and/or targeting another aspect of the disease to
offset the high level of TS produced. If a patient were screened and found not to have
a high level of TS transcription, then a physician might decide to go with the
conventional treatment, which has markedly higher success rates in patients without
high TS transcription. There are many possible ways to use the presence or absence of
the TS SNP in conjunction with the knowledge of the tandem repeats because the
presence of the TS SNP means that an extra copy of the tandem repeat does not confer
higher TS transcriptional activity. Thus, those skilled in the art will be better able to
genetically screen a person and accurately determine the level of TS transcription from
the screen alone. The novel TS diagnostic marker can then be translated into
preferred methods of treatment for the given disease.
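
By way of illustration only, the genotype reasoning of the preceding paragraph may be sketched in Python. The per-allele numbers of functional USF E-boxes (one for 2R, two for wild-type 3R, one for the variant 3RV) are taken from this disclosure. The names FUNCTIONAL_EBOXES, functional_ebox_count and predicted_transcription, and the two-level "higher"/"lower" labeling, are hypothetical and are not part of the described methods.

```python
# Functional USF E-boxes carried by each allele, per this disclosure:
# 2R has one, wild-type 3R has two, and the variant 3RV has one.
FUNCTIONAL_EBOXES = {"2R": 1, "3R": 2, "3RV": 1}


def functional_ebox_count(genotype: str) -> int:
    """Total functional E-boxes over both alleles of a genotype such as '3R/3RV'."""
    alleles = genotype.split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    return sum(FUNCTIONAL_EBOXES[allele] for allele in alleles)


def predicted_transcription(genotype: str) -> str:
    """Hypothetical two-level label: only 3R/3R (four E-boxes) is called 'higher'."""
    return "higher" if functional_ebox_count(genotype) == 4 else "lower"


for g in ("3R/3R", "3R/3RV", "2R/3R", "2R/3RV", "2R/2R"):
    print(g, functional_ebox_count(g), predicted_transcription(g))
```

The sketch counts functional repeats rather than repeats, which is why 3R/3RV is not grouped with 3R/3R. Any clinical threshold would have to be established separately.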

[0047] A separate aspect of the present invention focuses on another
polymorphism: a 6 bp/1494 deletion polymorphism in the 3'-untranslated region
(3'UTR) of the human TS gene. The present invention shows that this
polymorphism causes message instability and is associated with decreased
intratumoral TS mRNA levels. Insertion of the 3'UTR of TS containing the +6
bp/1494 polymorphism into the luciferase 3'UTR resulted in a ~35% decrease in
luciferase activity, and a similar decrease in mRNA levels, compared to the empty
pGL3-control vector. A series of deletions of the 3'UTR of TS resulted in no
significant differences in luciferase activity compared to the full-length 3'UTR,
showing that regions within the TS-3'UTR are relatively stable overall. Insertion of
the TS-3'UTR containing the -6 bp/1494 deletion polymorphism resulted in a ~70%
decrease in luciferase activity and a ~60% decrease in mRNA levels compared to the
empty pGL3-control vector, indicating that the deletion polymorphism caused a
decrease in mRNA stability.
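
As a rough worked example only, the two reported decreases can be compared with each other rather than with the empty vector. The approximate 35% and 70% figures are those given above; the normalization of the empty pGL3-control vector to 1.0 and the name relative_activity are assumptions made for illustration, and the resulting ratio is not a value reported in this disclosure.

```python
# Approximate luciferase activities, with the empty pGL3-control vector set to 1.0
# (an assumed normalization), using the decreases reported above.
activity_plus6 = 1.0 - 0.35   # TS 3'UTR with the +6 bp/1494 insertion
activity_minus6 = 1.0 - 0.70  # TS 3'UTR with the -6 bp/1494 deletion


def relative_activity(variant: float, reference: float) -> float:
    """Activity of one construct expressed as a fraction of another."""
    return variant / reference


ratio = relative_activity(activity_minus6, activity_plus6)
print(f"deletion construct retains about {ratio:.0%} of insertion construct activity")
```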

[0048] Further proving that the deletion causes instability, the TS-3'UTR
containing the -6 bp/1494 deletion polymorphism had a significantly higher rate of
message degradation compared to the +6 bp/1494 construct. Measurement of
intratumoral TS mRNA levels demonstrated that individuals homozygous for the
insertion (+6 bp/+6 bp) polymorphism had significantly higher TS mRNA levels
compared to individuals that were homozygous for the deletion (-6 bp/-6 bp)
polymorphism (p < 0.007). Statistical analysis determined the frequency of the -6
bp/1494 deletion polymorphism in a variety of ethnic populations to be 41% in
non-Hispanic whites, 26% in Hispanic whites and 52% in African-Americans. Taken
together, these results signify that the -6 bp/1494 deletion polymorphism in the 3'UTR
of TS is associated with decreased mRNA stability in vitro and lower intratumoral TS
expression in vivo.

[0049] Thus, knowledge of the effect of the -6 bp/1494 deletion polymorphism
is a useful screening tool in predicting an individual's TS mRNA levels in a clinical
setting. Because the -6 bp/1494 deletion polymorphism causes TS mRNA
instability, a screen can be used to find individuals with this deletion polymorphism.
The results of the screen can then be used to tailor cancer treatments and/or cancer
prevention therapies depending on the result. For example, as explained above, an
individual with the deletion polymorphism has less stable TS mRNA and so is more likely to
be responsive to therapies that target TS. Further, an individual with the deletion
polymorphism has a lesser chance of developing cancer because it is less
probable that the TS in the cancerous cells will be stable and that the cancer will be
capable of robustly progressing and possibly spreading throughout the individual.

[0050] The polymorphisms of the present invention may be used separately as
screens, but are preferably used together to determine whether individuals have a
higher likelihood of TS disruption (3R/3RV, 2R/2R, 2R/3R, or 2R/3RV with the -6
bp/1494 deletion), an average likelihood of TS disruption (3R/3RV, 2R/2R, 2R/3R, or
2R/3RV with +6 bp/1494, or 3R/3R with the -6 bp/1494 deletion), or a lower likelihood of
TS disruption (3R/3R with +6 bp/1494). Using the polymorphisms in conjunction
may allow a diagnostician a more precise estimate of TS disruption and thus a more
accurate idea of how that individual will respond to either cancer treatment or
cardiovascular disease treatment.
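
The three-way grouping described in the preceding paragraph may be sketched as a simple lookup. The groupings themselves are taken from that paragraph. The function name ts_disruption_likelihood and the single boolean has_deletion are hypothetical simplifications: the paragraph does not say how a carrier of one deletion allele and one insertion allele would be treated, and genotypes it does not list (such as 3RV/3RV) are deliberately left unclassified.

```python
# Repeat genotypes that the paragraph above groups together (all except 3R/3R).
NON_3R3R_GENOTYPES = {"3R/3RV", "2R/2R", "2R/3R", "2R/3RV"}


def ts_disruption_likelihood(repeat_genotype: str, has_deletion: bool) -> str:
    """Return 'higher', 'average' or 'lower' likelihood of TS disruption."""
    if repeat_genotype == "3R/3R":
        return "average" if has_deletion else "lower"
    if repeat_genotype in NON_3R3R_GENOTYPES:
        return "higher" if has_deletion else "average"
    # Genotypes not listed in the paragraph (e.g. 3RV/3RV) are not classified.
    raise ValueError(f"genotype not covered: {repeat_genotype!r}")


print(ts_disruption_likelihood("2R/3RV", has_deletion=True))
print(ts_disruption_likelihood("3R/3R", has_deletion=False))
```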

[0051] Drug selection is also linked to the polymorphisms because it can be
investigated how a particular drug acts within the portion of the population possessing
a particular TS polymorphism. If higher TS transcription and/or activity is known
to be associated with reduced efficacy of a given drug, a clinician may pursue other
methods of treating the cancer and/or CVD, either instead of or in addition to the
administration of the pharmaceutical. Conversely, if lower TS transcription and/or
activity is known to be associated with increased efficacy of a given
pharmaceutical, it would likely be advantageous to administer that pharmaceutical to a
person matching the reduced TS profile. The methods further indicate how likely it is
that an individual will develop cancer and/or cardiovascular disease based on the
probable transcription rate and stability of TS. Again, the greater the stability of TS,
the higher the likelihood of developing cancer and/or cardiovascular disease and the lower
the likelihood of effective treatment of those diseases because stable TS allows for the
survival and propagation of the diseased tissue.

[0052] A further aspect of the invention provides kits for carrying out the methods of
TS polymorphism identification and screening described herein. Preferably, the kits
will comprise primers, probes, implements for the arrays, screening arrays, and
instructions for use. Preferably, the kits also contain the reagents, polymerase, tubes,
and any other substance or equipment required to carry out the identification of one or
both of the polymorphisms.

[0053] Other aspects of the present invention contemplate using genetic and/or
protein-based manipulation to control TS transcription and/or TS enzyme activity.
If genetic manipulation is intended, vectors containing the preferred form of the TS
gene may be introduced in vitro or in vivo to cells of the individual. Alternatively,
host cells may be genetically engineered with vectors of the invention and produce the
polypeptides of the invention by recombinant techniques both in vitro and in vivo, as
well as by ex vivo procedures. Introduction of the TS polynucleotides with the preferred
polymorphisms into host cells can then be effected by methods described in many
standard laboratory manuals (See Davis et al., Basic Methods In Molecular Biology
(1986) and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989)). The use of
vectors, preferably targeted recombinant viral vectors, is well known in the art (See,
for example, USPN 6,635,476).

[0054] The vectors should incorporate relevant promoters, enhancers, and the like to 
aid the alteration of the TS sequence. Promoter regions can be selected from any 
desired gene with selectable markers. Two appropriate vectors are pKK232-8 and 
pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda
PR, PL and trp. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-1. 
Selection of the appropriate vector and promoter is well within the level of ordinary 
skill in the art. 

[0055] A further aspect of the invention is the use of antibodies to the TS enzymes to 
reduce the activity of the TS enzyme. Preferably, the antibodies are targeted and 
immunospecifically bind to the TS enzymes with the highest TS activity. Polyclonal or 
monoclonal antibodies directed towards the polypeptide encoded by TS may be prepared 
according to standard methods. Monoclonal antibodies may be prepared according to general 
hybridoma methods of Kohler and Milstein (Nature (1975) 256:495-497), the trioma
technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today (1983)
4:72) and the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies And Cancer
Therapy, pp. 77-96, Alan R. Liss, Inc., 1985). Antibodies utilized in the present invention
may be polyclonal antibodies, although monoclonal antibodies are preferred because they
may be reproduced by cell culture or recombinantly, and may be modified to reduce their
antigenicity. Polyclonal antibodies may be raised by a standard protocol by injecting a
production animal with an antigenic composition, formulated as described above. (See, e.g.,
Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988.)
[0056] Alternatively, for monoclonal antibodies, hybridomas may be formed by isolating the 
stimulated immune cells, such as those from the spleen of the inoculated animal. These cells 
are then fused to immortalized cells, such as myeloma cells or transformed cells, which are 
capable of replicating indefinitely in cell culture, thereby producing an immortal, 
immunoglobulin-secreting cell line. The immortal cell line utilized is preferably selected to 
be deficient in enzymes necessary for the utilization of certain nutrients. Many such cell lines
(such as myelomas) are known to those skilled in the art, and include, for example, lines
deficient in thymidine kinase (TK) or hypoxanthine-guanine phosphoribosyl transferase
(HGPRT). These deficiencies allow selection for fused cells according to their ability to
grow on, for example, hypoxanthine-aminopterin-thymidine (HAT) medium. The antibodies
may be administered parenterally, intravenously, or orally.

[0057] These and other embodiments of the inventions will be apparent from the description 
of the experiments. 

H. EXPERIMENTS 

Overview of the 5' SNP Experiment Series 
[0058] The following examples are meant to illustrate the present invention and 

are not limitations upon it. All citations throughout the disclosure are incorporated by 
reference and found in a complete listing at the end of the written description. 
[0059] Thymidylate synthase (TS) gene expression is modulated in part by a
polymorphism in the 5' regulatory region of the gene. The polymorphism consists of
either two repeats (2R) or three repeats (3R) of a 28 bp sequence, yielding greater TS
gene expression and protein levels with a 3R genotype. The sequence of the third
repeat is a 28 base pair sequence of CCGCGCCACTTGGCCTGCCTCCGTCCCG,
designated SEQ ID NO:1. Two USF family E-box consensus elements are found
within the tandem repeats of the 3R genotype and one within the 2R genotype. These
elements bind USF protein complexes in vitro, as shown by electrophoretic mobility shift
assays (EMSA), and in vivo, as shown by ChIP assay. The present disclosure shows that the
additional USF consensus element within the 3R construct confers greater transcriptional
activity relative to the 2R construct. Mutagenesis of the USF sites demonstrates that the
transcriptional regulation of TS is dependent on USF proteins binding within the tandem
repeats.

[0060] A novel G→C single nucleotide polymorphism identified in
the second repeat of 3R alleles, within the USF consensus element, alters the ability of
USF proteins to bind to the mutated site, and thus alters the transcriptional activation
of TS genes bearing this genotype. Through RFLP analysis, the frequency of this
polymorphism (3RV) was determined to be 56% of all 3R alleles in healthy
Non-Hispanic White individuals. A single nucleotide polymorphism is a DNA sequence
variation that occurs when a single nucleotide (A, T, C, or G) in the genome sequence
is changed. Screening for the SNP in combination with the tandem repeat
polymorphism significantly increases the value of the tandem repeats alone in
predicting response to, and survival after, 5-FU/LV chemotherapy treatment. The more
non-variant 3R copies of the TS allele that a subject has, the worse the response to
chemotherapy drugs. Therefore, this novel SNP of the 5' tandem repeat polymorphism
can be used as a predictor of clinical outcome to thymidylate synthase inhibitors and
other chemotherapeutic agents.

[0061] Thus, the present invention characterizes the mechanism of
transcriptional activation from the tandem repeats and describes how an additional 28
bp repeat can enhance transcriptional activity. The present invention also identifies a
highly penetrant single nucleotide polymorphism within the 3R that can abolish its
increased transcriptional activity relative to the 2R, and shows that sequence variations
within the tandem repeats have functional significance.

[0062] In order to determine what regulatory factors were involved in
transcriptional activation from the tandem repeats, sequence consensus elements
within the 28 base pair regions were investigated. An E-box site, CACTTG, lies
within the middle of the first and second repeats of the 3R polymorphism and within
the first repeat of the 2R polymorphism (See Fig. 1). EMSA analysis using nuclear
extracts with competitor oligonucleotides identified USF complexes bound to this
element in vitro. However, only USF-1 that had been phosphorylated by cdc2/p34
was bound to its consensus element within the tandem repeats. ChIP analysis shows
the presence of both USF-1 and USF-2 at the TS 5' regulatory region in vivo. The
region amplified in the ChIP assay contained only the putative E-box sites located
within the tandem repeats and shows that USF-1 and USF-2 were bound to these sites
in vivo.

[0063] Through site-directed mutagenesis, it is shown that USF consensus
elements within the first repeat of either the 2R or 3R constructs are necessary for
efficient transcriptional activation of the luciferase reporter gene constructs. These
assays also showed that an extra repeat in the 3R construct adds an additional USF
binding site that leads to increased transcriptional activity compared to the 2R
construct, in the absence and presence of exogenous USF-1. Thus, the enhancer
function of the tandem repeats at the transcriptional level increases as the number of
USF E-box sites increases. Although USF-2 did not activate the TS promoter
constructs significantly, the presence of USF-2 in the ChIP assay, and the fact that
these proteins exist as heterodimers to a large extent in the cell, suggest that USF-2
may be present in complex with USF-1 at the TS tandem repeats in vivo.
[0064] It has been postulated that the number of tandem repeats in the TS gene
itself determines the level of TS expression. However, the present invention modifies
this theory with the identification of an unexpected and novel SNP within the tandem
repeats that alters the enhancer function of an extra repeat. A single G→C base
change found at the 12th nucleotide of the second repeat in the 3R genotype alters a
critical residue in the USF consensus element. Thus, it is not the number of tandem
repeats alone, but the number of functional tandem repeats that determines the level of
TS transcription. An EMSA assay shows that this base change abolishes the ability of
USF complexes to bind within the repeat and effectively eliminates the E-box site. A
3R construct bearing this variation, 3RV, was isolated from patient genomic DNA and
used in luciferase reporter assays to analyze the effects of this polymorphism on
transcription. The 3RV construct displayed transcriptional activity similar to that of a 2R
construct (Fig. 5B). These results suggest that the addition of a 28 bp repeat alone is
not sufficient for enhanced transcriptional activity of the TS gene, but that a USF
E-box element is required within the extra repeat in order to enhance transcription.

[0065] This experiment revised the previous PCR-based method for determining
tandem repeat polymorphism genotype (Horie et al., 1995) into a restriction fragment
length polymorphism (RFLP) technique that includes a screen for the G→C SNP. A
smaller PCR fragment is amplified (to remove extraneous HaeIII sites) and half of the
sample is left undigested while the other half is digested with the HaeIII restriction
enzyme. When patient samples are run side-by-side on an agarose gel, the tandem
repeat polymorphism as well as the SNP can be determined for both alleles. The
frequency of the SNP in 99 colorectal cancer patients (Table 1) was determined using
this novel method.
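
The basis of such a restriction screen can be illustrated with a short sketch. It assumes that the screen works because the G→C change removes a HaeIII recognition site (GGCC) spanning the 12th nucleotide of the repeat. That is an inference from the sequence printed as SEQ ID NO:1, not an explicit statement of this disclosure, and the helper names substitute and haeiii_sites are hypothetical.

```python
import re
from typing import List

HAEIII_SITE = "GGCC"  # HaeIII recognition sequence
REPEAT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"  # SEQ ID NO:1 as printed in this disclosure


def substitute(seq: str, position: int, base: str) -> str:
    """Return seq with the nucleotide at a 1-based position replaced by base."""
    return seq[: position - 1] + base + seq[position:]


def haeiii_sites(seq: str) -> List[int]:
    """1-based start positions of every HaeIII site in seq (overlaps included)."""
    return [m.start() + 1 for m in re.finditer(f"(?={HAEIII_SITE})", seq)]


variant = substitute(REPEAT, 12, "C")  # G to C at the 12th nucleotide
print(haeiii_sites(REPEAT))   # sites in the repeat as printed
print(haeiii_sites(variant))  # sites after the base change
```

Under this assumption a repeat carrying the base change would resist digestion at that position, so digested and undigested lanes would differ only for repeats with the intact site.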

Table 1: Distribution of the 5'-TS tandem repeat polymorphism and the novel
G→C polymorphism in the second repeat of the 3R among 99 Non-Hispanic
White individuals

Genotype      2R/2R   2R/3R   2R/3RV   3R/3R   3R/3RV   3RV/3RV   Total
Number           19      13       31      11       16         9      99

Allele        2R/2R   2R/3R   2R/3RV   3R/3R   3R/3RV   3RV/3RV   Frequency
2R-Allele        38      13       31       -        -         -   .414 (1)
3R-Allele         -      13       31      22       32        18   .586 (1)
3RV-Allele        -       -       31       -       16        18   .560 (2)

(1) Frequency of allele is shown as percentage of all alleles.
(2) Frequency of allele is shown as percentage of 3R alleles only.
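
The allele frequencies in Table 1 follow from the genotype counts by simple allele counting, which may be sketched as below. The genotype counts are those of Table 1; the function and variable names are hypothetical. Consistent with the table, the 3R frequency counts variant and non-variant three-repeat alleles together, and the 3RV frequency is expressed relative to three-repeat alleles only.

```python
from collections import Counter

# Genotype counts from Table 1 (99 individuals).
GENOTYPE_COUNTS = {
    "2R/2R": 19, "2R/3R": 13, "2R/3RV": 31,
    "3R/3R": 11, "3R/3RV": 16, "3RV/3RV": 9,
}


def allele_counts(genotype_counts):
    """Count alleles: each individual contributes the two alleles of the genotype."""
    counts = Counter()
    for genotype, n in genotype_counts.items():
        for allele in genotype.split("/"):
            counts[allele] += n
    return counts


counts = allele_counts(GENOTYPE_COUNTS)
total_alleles = sum(counts.values())
three_repeat = counts["3R"] + counts["3RV"]  # all three-repeat alleles, variant or not

freq_2r = counts["2R"] / total_alleles
freq_3r = three_repeat / total_alleles
freq_3rv_among_3r = counts["3RV"] / three_repeat

print(f"2R: {freq_2r:.3f}  3R: {freq_3r:.3f}  3RV among 3R: {freq_3rv_among_3r:.3f}")
```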

[0066] The SNP is easily screened for with the addition of a simple restriction
digestion and generates useful information for clinicians in order to tailor individual
chemotherapy with respect to both tumor response and host toxicity in relation to cancer
treatment. In relation to CVD treatment, clinicians can determine if a subject is at a
higher risk for CVD by looking at the number of non-variant alleles. The clinician can
then tailor the therapy accordingly, noting that the levels of folate will likely be lower
and homocysteine will likely be higher than in subjects with more 2R and/or 3RV
alleles, thus making that individual more susceptible to CVD.

[0067] The regulation and functions of USF proteins add further complexity to
the TS-inhibition pathway and to the formation and progression of carcinogenesis.
The USF proteins have been traditionally described as ubiquitous regulatory factors
but recent evidence has shown that these proteins can be misregulated in some forms
of cancer (Kawakami et al., 1999) and are overexpressed during periods of
malnutrition, particularly protein-free diets (Matsukawa et al., 2001). Further, USF-1
is activated by the stress-responsive p38 kinase. It has been postulated that this
activation provides a link between stress stimuli and the subsequent changes in gene
expression that occur as a result of treatment with stress-inducing agents (Galibert et
al., 2001), possibly including chemotherapeutic agents. Thus, it can be hypothesized
that overexpression of USF proteins could cause increased activation of genes targeted
by USF-1/USF-2 complexes, thereby implicating the USF proteins as mediators of TS
overexpression in vivo. USF overexpression could also lead to TS overexpression
indirectly, through activation of the tumor-suppressor p53 (Reisman et al., 1993),
which transactivates the TS promoter. Based on this evidence, USF-1 and USF-2
could play a role in causing the drug resistance seen in patients treated with TS
inhibitors through direct and indirect mechanisms. The present invention
contemplates blocking USF binding sites on the TS gene so that, even if USF were
overexpressed, the TS E-boxes would be competitively bound by a non-TS activating
substance.

[0068] Conversely, loss of USF function could contribute to carcinogenesis 

(Ismail et al., 1999). Genetic alterations in APC are thought to be one of the earliest 
steps in colon carcinogenesis (Ichii et al., 1992) and loss of APC gene function has 
been correlated with increased c-Myc oncogene activity (Erisman et al., 1985; Jaiswal 
et al., 1999). USF-1/USF-2 complexes have been shown to transactivate the APC gene
(Jaiswal et al., 2001). Since USF proteins antagonize the effects of c-Myc (Luo et al., 
1996), it has been proposed that loss of USF function could cause down regulation of 
APC leading to increased c-Myc expression and enhanced cellular proliferation 
(Pullarkat et al., 2001). 

[0069] Here, the present invention provides evidence for a direct role of USF 

proteins in the regulation of TS gene expression and suggests that the inhibition of 
USF activity or USF binding sites could also be considered as a modulating therapy 
for TS-directed anti-cancer drugs. Based on the results, the fact that the novel SNP of 
the present invention alters the ability of the repeats to function as enhancers of 
transcription explains discrepancies in response to 5-FU treatment. Considering the 
importance of the TS reaction in folate metabolism, the novel polymorphism may have 
additional influence in the modulation of other folate dependent pathways. In addition 
to thymidylate biosynthesis, purine synthesis, methionine regeneration, and other
one-carbon donor reactions, such as those involved in DNA methylation, are all influenced
by this polymorphism. Here, the present invention demonstrates that a transcriptional
component within the tandem repeats exists and proves that this component is altered
by differences in the nucleotide sequence of the repeats. The present comprehensive
analysis of both polymorphisms contributes to a more precise prediction of TS gene
expression and clinical outcome to fluoropyrimidines and other chemotherapeutic
drugs, and to predicting and treating CVD.

1. The 28 bp Tandem Repeats in the 5' Regulatory Region of the 
Human TS Gene are Not Identical in Their Nucleotide 
Sequences 

[0070] The published sequence of the human TS gene and its 5' upstream 

regions (Takeishi et al., 1989) shows that there are two single base changes in the last 
28 bp repeat of both the 2R and 3R genotypes, and recent evidence has shown that 
these sequence differences exist in the last repeats of the 4R and 5R alleles as well 
(Luo et al., 2002). The consequences of these base changes on TS gene expression, as 
well as the frequency of these base changes, have not been examined. Thus, this 
experiment sought to verify the presence of sequence differences within the repeats, 
and to look for other base changes and potential polymorphisms. By direct sequencing 
of 14 human genomic DNA samples, the experiment verified the presence of the two 
base changes in the last repeats of 2R and 3R, and identified a novel single nucleotide 
polymorphism within the second repeat of 3R (Fig. 1, asterisks). 
[0071] Figure 1 is the structure of a tandem repeat polymorphism within the
5'-untranslated region of the human TS gene. An enhancer polymorphism in the
5'-untranslated region of the thymidylate synthase gene consists of either two or three 28
bp repeats. The third repeat is SEQ ID NO:1. The nucleotide sequence of these
repeats is shown above and bears variations within each repeat. A putative E-box
binding site for upstream stimulatory factor (USF-1/USF-2) has been identified and is
underlined and bolded within each repeat. The consensus sequence for USF DNA
binding is CANNTG (where N is any nucleotide). Repeats one and two of 3R and
repeat one of 2R contain USF consensus elements (underlined) while the last repeat in
either construct contains an imperfect or variant consensus sequence due to a G→C
base change (asterisks) that disrupts the putative E-box. The last nucleotide of the
final repeat in 2R and 3R also bears a G→C base change.
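
As a minimal sketch of how a repeat can be checked against the USF consensus CANNTG, the following scans a 28 bp repeat for the motif before and after a G→C change at the 12th nucleotide. The sequence is SEQ ID NO:1 as printed in this disclosure; the regular expression, the name find_eboxes and the way the variant is built are illustrative assumptions and do not reproduce the database search used in the experiments.

```python
import re
from typing import List, Tuple

# USF consensus CANNTG, where N is any nucleotide; the lookahead reports overlaps too.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
REPEAT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"  # SEQ ID NO:1 as printed in this disclosure


def find_eboxes(seq: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched hexamer) for each CANNTG site in seq."""
    return [(m.start() + 1, m.group(1)) for m in EBOX.finditer(seq)]


variant = REPEAT[:11] + "C" + REPEAT[12:]  # G to C at the 12th nucleotide
print(find_eboxes(REPEAT))
print(find_eboxes(variant))
```

In the printed sequence the 12th nucleotide is the final G of the hexamer CACTTG, so the single base change is enough to remove the only match.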

[0072] In order to determine if these base changes exert a functional role on
gene expression, it was first sought to identify regulatory factors that bind within the
28 bp TS tandem repeats. Both the 28 bp sequence lacking and the 28 bp sequence
bearing the base changes were scanned for putative transcription factor binding sites
using the TRANSFAC database (Wingender et al., 2000). A USF E-box consensus
element (CACTTG) was found within the first repeat of the 2R genotype and within
the first two repeats of the 3R genotype, but not in the last repeat of either genotype
(Fig. 1). The C at the 12th nucleotide of the last repeat of 2R and 3R lies within the
USF consensus sequence element at a critical nucleotide for USF binding. The
potential SNP at the 12th nucleotide in the second repeat of 3R changed the USF
consensus element in a similar fashion (Fig. 1, shaded nucleotide). These results show
that USF regulatory factors can bind to sequences within the TS tandem repeats.

2. Phospho-USF-1 Binds to Consensus Elements Within the TS
Tandem Repeats But Not To Repeats Containing the G→C
Base Change at the 12th Nucleotide

[0073] To determine the sequence-specific binding of USF proteins to the TS
tandem repeats in vitro, a 28 bp sequence bearing the putative USF consensus E-box
element was used as a probe in electrophoretic mobility shift assays (EMSA). Figure
2 is an EMSA showing that USF proteins bind to the E-box site within the tandem
repeats of the human TS gene in HT29 nuclear extracts. Gel mobility shift analyses
were performed using HT29 nuclear extracts with a 32P-labeled 28 bp probe
corresponding to a tandem repeat sequence containing an intact E-box site. In Figure
2, lane 1 is free probe. In lane 2, 2.5 µg of HT29 nuclear extracts were incubated with
probe in the absence of unlabeled competitor oligonucleotide, resulting in the presence
of numerous band shifts on the gel.

[0074] The addition of an unlabeled specific USF competitor oligonucleotide to
the reaction resulted in the absence of two bands, which were again present when
non-specific competitor was added (Fig. 2, lanes 3-6). In lanes 3-4, extracts were
pre-incubated with unlabeled specific competitor oligonucleotides to USF-1 in increasing
molar excess. In lanes 5-6, extracts were pre-incubated with unlabeled non-specific
competitor poly(dI-dC) in increasing molar excess. Arrows indicate USF protein
complexes. These competition experiments show sequence-specific binding of USF
complexes to the 28 bp tandem repeat sequences containing intact USF E-box
elements.

[0075] Since USF-1 shows increased affinity for its DNA consensus element
when it is phosphorylated, the ability of the phosphorylated and unphosphorylated
forms of USF-1 to bind to the putative consensus element within the 28 bp repeat
sequence was tested. Recombinant USF-1 was expressed in E. coli with a 6-histidine
tag and purified on a Ni-NTA column. Cdc2/p34 was immunoprecipitated from HeLa
S3 cells and used in an in vitro kinase reaction to phosphorylate 200 ng of
recombinant USF-1 (Fig. 3A). 32P-labeled ATP was used in the control reaction for
visualization of phosphorylation after exposure of the film (right panel).
[0076] When both forms of USF-1 were used in an EMSA assay utilizing the
perfect USF-1 consensus element as a probe, only the phosphorylated form was able to
bind (Fig. 3B, lanes 2 and 3). Gel mobility shift analyses were performed using
recombinant USF-1 with a 32P-labeled USF-1 specific consensus probe containing an
intact E-box site. Lane 1 contained only free probe. Lane 2 contained 30 ng of recombinant
phospho-USF-1 incubated with probe in the absence of unlabeled competitor
oligonucleotides. Lane 3 contained 30 ng of recombinant unphosphorylated USF-1
incubated with probe in the absence of unlabeled specific competitor oligonucleotides.
In lanes 4-5, phospho-USF-1 was pre-incubated with a 500-fold molar excess of unlabeled
USF-1 specific competitor oligonucleotide and a 500-fold molar excess of nonspecific
poly(dI-dC) competitor, respectively.

[0077] To determine the ability of the phosphorylated form of USF-1 to bind to
its consensus element within the TS repeat, an EMSA assay was carried out using the
32P-labeled 28 bp sequence as a probe corresponding to one tandem repeat containing
an intact E-box site. Lane 1 was free probe. In lane 2, 30 ng of recombinant USF-1
was incubated with probe in the absence of unlabeled competitor oligonucleotides. In
lane 3, 30 ng of phospho-USF-1 was incubated with probe in the absence of unlabeled
competitor oligonucleotides. In lanes 4-6, phospho-USF-1 was pre-incubated with a
500-fold molar excess of unlabeled probe, USF-1 specific competitor, and non-specific
poly(dI-dC) competitor oligonucleotides, respectively. Incubation of the
phosphorylated form of USF-1 with the probe caused a shift on the gel that was
abolished by the addition of unlabeled specific competitor oligonucleotides (Fig. 3C,
lanes 3 and 5). These data further prove that only the phosphorylated form of USF-1
can bind its consensus element within the tandem repeats.

[0078] Since the potential G→C SNP at the 12th nucleotide of the 28 bp repeats
lies within the USF binding site, the ability of the recombinant USF-1 protein to bind
the variant consensus element was tested by EMSA. Neither the unphosphorylated nor
the phosphorylated form of USF-1 showed any affinity for this variant sequence (Fig. 3D,
right panel). Gel mobility shift analyses were performed using recombinant USF-1
with a 32P-labeled 28 bp probe corresponding to one tandem repeat containing a G→C
base change at the 12th nucleotide. Lane 1 was free probe. In lane 2, 30 ng of
recombinant USF-1 was incubated with probe. In lane 3, 30 ng of phospho-USF-1
was incubated with probe. These data show that the potential SNP within the tandem
repeats abolishes USF binding by disrupting the USF consensus E-box element.

3. USF-1 and USF-2 Bind to the Thymidylate Synthase Tandem
Repeats in vivo 

[0079] The results of the in vitro assays show sequence-specific binding of
USF-1 and USF-2 to the tandem repeats of the thymidylate synthase gene at E-box
consensus sites. To determine if USF-1 and USF-2 were bound to these elements in
vivo, a chromatin immunoprecipitation (ChIP) assay using live 293 (human embryonic
kidney) cells was performed using genomic DNA from 1 x 10^6 cells and antibodies for
USF-1 and USF-2. Input DNA was a 20 µl aliquot of DNA taken before addition of
antibodies, and the no-antibody control was performed alongside the USF-1 and USF-2
immunoprecipitations without the addition of antibody. After formaldehyde
cross-linking of proteins to DNA and shearing of genomic DNA by sonication,
immunoprecipitations using USF-1 and USF-2 antibodies were performed. The
immunoprecipitations included a control reaction, which was performed without the
antibodies.

[0080] After the pull-downs, PCR amplification was performed at 64.8 °C to
determine whether the TS 5' regulatory region containing the tandem repeats (+15 to +195
relative to the transcription start site) was bound by USF-1 or USF-2. The PCR
product was then ethanol precipitated and electrophoresed on a 1.5% agarose gel. The
180 bp fragment was amplified from the immunoprecipitations using USF-1 and USF-2
polyclonal antibodies but was not present in the control reaction lacking antibody
(Figure 4). These results show the presence of USF-1 and USF-2 on the chromatin at
the TS locus, which includes the tandem repeats and E-box elements. This particular
region of DNA contains no putative E-box elements other than those located
within the tandem repeats. The presence of USF-1 and USF-2 at the TS 5' regulatory
region therefore indicates that these proteins bind to the E-box elements located within the
tandem repeats. These data led to the examination of the potential role of these
proteins in activating transcription of TS 5' regulatory region reporter constructs.

4. USF-1 Transactivates the TS Promoter Through Binding of
Tandem Repeats Containing E-box Elements 

[0081] To examine the ability of USF-1 and USF-2 to enhance transcription
through binding within the tandem repeats, the 5' promoter region of the human TS
gene from -313 to +195 (relative to TS transcription start), including the 5'
untranslated region, was cloned into the TATA-less pGL3-Basic luciferase reporter
vector just upstream of the luciferase translation start site. Both 2R and 3R constructs
were individually cloned into the vector, and 2RmutUSF and 3RmutUSF were created
by altering the indicated USF consensus elements through site-directed mutagenesis.
Figure 5A is a diagram of these TS luciferase reporter constructs. The 3RV
construct lacks an E-box element in the second repeat due to the G→C SNP.
All E-box elements are labeled USF and all variant or mutant
elements are labeled with an X.

[0082] These constructs were co-transfected into 293 cells along with either a
USF-1 expressing vector, a USF-2 expressing vector, or an empty vector. Results
from these experiments show that there was an increase in relative luciferase activity
from both the 2R and 3R constructs in the presence of USF-1 (Fig. 5B). This 2- to 3-fold
increase in transcriptional activity is consistent with previous reports of activation by
USF. USF-2 activation led to a modest increase in relative luciferase activity. The 3R
construct had greater luciferase activity than the 2R construct in both the absence and
presence of exogenous USF-1 protein expression, and this difference between 2R and
3R transcriptional activity is consistent with previous reports in a similar luciferase
system. These differences are significant because subtle differences in TS gene
expression have been shown to be significant in predicting response to 5-FU in vivo
(Lenz et al., 1996). Both 2RmutUSF and 3RmutUSF showed dramatically decreased
transcriptional activity, below endogenous levels of transcription, compared to their
wild-type counterparts, indicating that these USF sites, one in the 2R and two in the
3R, are critical to TS promoter activation. Consequently, these sites may be
responsible for greater transcriptional activity from the 3R overall.

[0083] Since the single G→C base change at the 12th nucleotide of the 28 bp
repeats can abolish the ability of USF proteins to bind to this site by EMSA, it was
desirable to determine whether this base change would alter the ability of USF-1 to
transactivate the 3R TS promoter construct. The 3R variant (3RV) reporter construct
(Fig. 5A) had decreased transcriptional activity compared to the 3R, in the absence
and presence of exogenous USF-1. In addition, 3RV had a similar ability to
transactivate the luciferase reporter gene as the 2R construct (Fig. 5B). These data
show that the ability of the tandem repeats to enhance transcription increases only as
the number of USF consensus elements increases, and not necessarily as the number of
tandem repeats increases. Hence, the potential SNP within the second repeat of 3R is a
determinant of the ability of the 3R construct to act as an enhancer of transcription,
relative to the 2R construct. Overall, USF-2 alone led to a modest increase in
relative luciferase activity, and USF-1 and USF-2 co-transfections showed no
increase in luciferase activity compared to USF-1 alone.

5. Characterization of a Novel Single-Nucleotide Polymorphism
(SNP) by Restriction Fragment Length Polymorphism 
(RFLP) Analysis 

[0084] To determine the frequency of the potential SNP in a large population,
an RFLP analysis was developed. Figure 6A is a diagram of the HaeIII restriction
map of the TS tandem repeat fragments produced in this RFLP analysis. This map
shows the HaeIII restriction endonuclease sites within the fragments produced by
polymerase chain reaction (PCR) for the RFLP analysis.

[0085] PCR was carried out using genomic DNA samples from 99 healthy
Non-Hispanic White individuals, yielding PCR fragments of 213 bp for 2R alleles, 241 bp
for 3R alleles and both fragments for 2R/3R heterozygotes. TS genotypes could be
obtained from 99 samples. The G→C base change in the 3RV removes a HaeIII
restriction endonuclease site and changes the banding pattern of the digested PCR
fragment on a 3% SeaPlaque agarose gel in 0.5X TBE (Fig. 6B, undigested samples).
Half of the PCR products were digested with the HaeIII restriction enzyme and half
were left undigested. The arrows point to the additional 92 bp fragment that is present
in wild-type samples but is absent in samples positive for the G→C polymorphism.
Genotypes are listed above the corresponding lanes, showing the repeat polymorphism
(2 or 3) and the G→C SNP (V for variant).

[0086] Digested and undigested PCR products from each patient were run in
adjacent lanes to determine the repeat polymorphism genotypes and the G→C SNP
genotypes of each allele. Running undigested product next to digested product was
necessary since there are similar banding patterns for 2R/2R, 2R/3R, and 3R/3R as
well as for 2R/3RV and 3R/3RV when they are digested with the enzyme. In some
samples, non-specific DNA product of ~100 bp in length was observed in the
undigested samples. This non-specific DNA resulted in the presence of a ~60 bp band
in the HaeIII-digested samples that did not interfere with interpretation of the
genotype. Nevertheless, a single PCR reaction followed by digestion of half the
sample with HaeIII yielded patient genotypes for the tandem repeat polymorphism
and the SNP within the tandem repeats.

[0087] The G→C SNP at the 12th nucleotide was observed only in the second
repeat of 3R genotypes. The frequency of the 3R among the 100 Caucasian
individuals was 58.6% and consistent with earlier reports in Caucasians. The
frequency of the novel G→C SNP at the 12th nucleotide in the second repeat of the 3R
was 56% among all 3R carriers. These data suggest that the G→C base change at the
12th nucleotide of the second repeat of 3R alleles is a highly penetrant polymorphism
among Non-Hispanic Whites.

6. The SNP Significantly Increases the Value of the TS Tandem 
Repeats in Predicting Response and Survival to 5-FU/LV in 
Colorectal Cancer 

[0088] To explore the role of the SNP as a predictive marker, 40 patients with
disseminated colorectal cancer (SWOG 9420 and 3C-92-2) were evaluated for
response and survival following protracted infusion of 5-fluorouracil. The distribution of the
TS tandem repeat polymorphism was as follows: 2R/2R 20% (8/40), 2R/3R 50%
(20/40), and 3R/3R 30% (12/40). Patients confirmed for the 2R/2R genotype had a
50% (4/8) response to 5-FU as compared to 15% (3/20) in the 2R/3R group. No
patient with a 3R/3R genotype showed disease response (0/12). However, this
association did not reach statistical significance (P=0.089, Fisher's Exact Test).
Patients possessing the 2R/2R genotype showed a median survival of 16.2 months
compared to 7.4 months in the heterozygous group and 8.4 months for 3R/3R carriers.
This relationship also lacked statistical significance (P=0.14, Logrank
Test).

[0089] Patient samples were re-classified into two groups based on predicted
high and low TS expression using the tandem repeat polymorphism and the SNP.
Since it was hypothesized that the SNP, or variant 3R (3RV) allele, would decrease the
TS gene expression of a 3R allele, 21 patients with genotypes of 2R/2R,
2R/3RV and 3RV/3RV were grouped into the predicted "low TS expression" group (Group A) and
19 patients with 2R/3R, 3R/3RV, and 3R/3R into the predicted "high TS expression"
group (Group B). These groups were then re-evaluated for an association between TS
genotypes and clinical outcome to 5-FU chemotherapy.

[0090] Patients possessing one of the low-expression TS genotypes showed an
improved response rate to chemotherapy. Thirty-three percent (33%) (7/21) of
patients in Group A showed disease response, compared to 0% (0/19) of the patients in
Group B. Sixty-three percent (63%) (12/19) of patients in Group B showed disease
progression compared to only 48% (10/21) in Group A (P=0.019, Fisher's Exact Test)
(Table 2). In addition, study participants of Group A demonstrated a superior survival
of 10.1 months compared to only 7.4 months for patients of Group B (P=0.035,
Logrank Test).

Table 2. Association between TS genotypes and clinical outcome to 5-FU for
colorectal cancer patients

TS Genotype                      Clinical Response
                                 Response    Stable Disease    Progression    P-value (1)

2R/2R (n=8)                      4 (50%)     1 (13%)           3 (37%)
2R/3R (n=20)                     3 (15%)     5 (25%)           12 (60%)       P=0.089
3R/3R (n=12)                     0 (0%)      5 (42%)           7 (58%)

2R/2R, 2R/3RV,
3RV/3RV (n=21)                   7 (33%)     4 (19%)           10 (48%)
2R/3R, 3R/3R,                                                                 P=0.019
3R/3RV (n=19)                    0 (0%)      7 (37%)           12 (63%)

1 Based on Fisher's Exact Test
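
The regrouped comparison in Table 2 is a 2 x 3 contingency table (Group A and Group B against response, stable disease, and progression). The sketch below shows one way such an exact test can be computed, assuming the Freeman-Halton extension of Fisher's Exact Test with the usual "sum of tables no more probable than the observed one" rule. The specification does not state which software or test variant was used, so the function name `exact_p_2xc` and this choice of rule are illustrative assumptions, and the result need not match the reported P-value to the last digit.

```python
from itertools import product
from math import comb


def exact_p_2xc(row_a, row_b):
    """Two-sided exact p-value for a 2 x c table (Freeman-Halton rule).

    Enumerates every first row compatible with the fixed margins and sums the
    multivariate hypergeometric probabilities of tables that are no more
    probable than the observed one.
    """
    cols = [a + b for a, b in zip(row_a, row_b)]   # column totals
    n_a = sum(row_a)                               # size of the first group
    total = comb(sum(cols), n_a)                   # ways to choose the first group

    def prob(first_row):
        ways = 1
        for c, x in zip(cols, first_row):
            ways *= comb(c, x)
        return ways / total

    p_obs = prob(row_a)
    p_value = 0.0
    for cand in product(*(range(c + 1) for c in cols)):
        if sum(cand) != n_a:
            continue
        p = prob(cand)
        if p <= p_obs * (1 + 1e-9):                # tolerance for float ties
            p_value += p
    return p_value


# Counts taken from Table 2: response, stable disease, progression.
group_a = [7, 4, 10]    # 2R/2R, 2R/3RV, 3RV/3RV (n=21)
group_b = [0, 7, 12]    # 2R/3R, 3R/3R, 3R/3RV (n=19)
p = exact_p_2xc(group_a, group_b)
print(f"Exact p-value for Group A vs. Group B: {p:.3f}")
```

The enumeration is small here (a few thousand candidate rows), so no optimization is needed; for larger tables a dedicated statistics package would be the more practical choice.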



[0091] Comparative analysis of these data indicates that screening for the SNP
in combination with the tandem repeat polymorphism is more accurate and effective
for predicting clinical outcome to 5-FU based on TS genotype analysis. Including a
genotype for the SNP and regrouping patients based on predicted TS expression
increased the predictive value relative to the tandem repeats alone, both for response to
5-FU/LV chemotherapy (p=0.089 vs. 0.019 with SNP) and for overall survival (p=0.14 vs.
0.035 with SNP).

OVERVIEW OF THE 3' UTR EXPERIMENT SERIES

[0092] The experiments characterized the 3'UTR of TS and determined the
effects of a -6 bp/1494 deletion polymorphism on TS mRNA stability and/or
translational efficiency. Using a luciferase-based assay system to determine stability,
the experiments showed that the entire 3'UTR of TS was relatively stable overall and
contained no elements that caused significant instability or translational repression. It
was also determined that the -6 bp/1494 deletion polymorphism was associated with
decreased mRNA stability and an enhanced rate of mRNA decay. In addition, the -6
bp/1494 deletion polymorphism is of predictive value in determining the TS mRNA
levels of a given individual, is relatively common, and
varies greatly in frequency among different ethnic populations. Thus, it is an excellent
candidate for use in various cancer screens.

[0093] The -6 bp/1494 polymorphism is associated with decreased TS mRNA
levels, which leads to similarly decreased TS protein levels. The results are
consistent with a recent study involving Japanese patients with rheumatoid arthritis
(RA). RA patients who were homozygous for the deletion polymorphism (-6 bp/-6 bp)
had a significantly higher incidence of >50% improvement in serum C-reactive protein
levels, an indicator of response (Nozoe, 1998 and 2001), after treatment with low-dose
methotrexate than individuals bearing any +6 bp alleles (Kumagai, 2003). Another
related study screened for the tandem repeat polymorphism along with the newly 
identified functional Gl 16C SNP that lies within the tandem repeats. That functional 
SNP was shown to improve the value of the tandem repeats alone in predicting 
outcome of patients with metastatic colorectal carcinoma treated with a 5-FU based
chemotherapy regimen. These findings are consistent with the fact that patients with 
lower TS expression may be more sensitive to methotrexate, an indirect TS inhibitor, 
than individuals with higher TS expression. 

[0094] The functional -6 bp/1494 deletion polymorphism is a candidate for use
as a predictor of TS gene expression. Interestingly, a significant linkage
disequilibrium was found between the 5' triple tandem repeat genotype (3R/3R) and
the -6 bp/-6 bp 1494 deletion genotype in the Japanese RA study population
(Kumagai, 2003). In addition, an earlier study showed the first evidence of linkage
disequilibrium between the tandem repeat polymorphisms and -6 bp deletion
polymorphisms (Ulrich, 2002).

[0095] In this study, a significant linkage disequilibrium was found between 5'
double tandem repeat (2R/2R) genotypes and +6 bp/+6 bp 1494 genotypes in a
Caucasian population. These findings are significant due to the influences that these
polymorphisms have on TS gene expression and may help explain discrepancies
observed when screening individuals for the tandem repeat polymorphism alone. For
example, an individual who is homozygous for the 2R tandem repeat polymorphism
might display relatively high TS gene expression. This would give the false
impression that the 2R polymorphism was associated with high TS expression and
possibly increased resistance to 5-FU. Since the 2R tandem repeat polymorphism
occurs much more frequently with the +6 bp/1494 polymorphism that stabilizes TS
mRNA, screening for both polymorphisms, in conjunction with the recently identified
Gl 16C SNP within the tandem repeats resolves many of the discrepancies in 
correlating genotypes with TS gene expression. Further, the inclusion of the -6
bp/1494 deletion polymorphism in screens using the tandem repeats and Gl 16C SNP 
may improve the prognostic value of the test in future clinical studies. 

1. The 3'UTR of TS is Stable and Contains No Detectable
Elements of mRNA Instability or Translational Silencing.

[0096] In order to determine the function of the -6 bp/1494 polymorphism,
regulatory elements within the entire 3'UTR of TS were characterized. Since the +6
bp/1494 allele is more common in the sample population from which the constructs
were amplified, the analysis began with the TS 3'UTR containing the +6 bp/1494
polymorphism. Various reporter constructs were created by inserting regions of the
human TS-3'UTR into the 3'UTR of the luciferase gene in the pGL3-control plasmid
(Fig. 7A). By inserting the 3'UTR of TS into the unique XbaI restriction site of the
plasmid, each reporter construct was controlled by the SV40 promoter and each
contained an SV40 late poly(A) signal downstream of the luciferase 3'UTR. Since
each reporter construct differed only by the TS-3'UTR regions that were inserted,
changes in luciferase activity should be due to altered post-transcriptional regulation.

[0097] 293 cells were transiently transfected with each reporter construct,
incubated overnight for reporter gene expression, and assayed for luciferase activity.
All luciferase values are expressed as a percentage of the luciferase activity from the
pGL3-control construct that contained no regions of the TS-3'UTR. Luciferase
activity from this construct was designated as 100% activity. Inserting the full-length
3'UTR of TS (1-495) into the luciferase 3'UTR resulted in a ~35% decrease in
luciferase activity (Fig. 7B, black bars). The decrease in luciferase activity was
expected since the luciferase mRNA is a highly stable transcript on its own, and a
similar decrease in luciferase activity compared to control activity has been shown
previously in other 3'UTR studies using this system (Cok, 2001; Giles, 2003). Serial
deletions from the proximal and distal ends of the TS-3'UTR resulted in similar
decreases in luciferase activity overall, and had no additional effect on luciferase
activity compared to the full-length (1-495) construct (Fig. 7B, black bars).

[0098] In order to determine whether the observed changes in luciferase activity
were due to changes in mRNA stability or translational efficiency, luciferase mRNA
was quantified. If changes in luciferase activity correlated with changes in luciferase
message levels, then alterations seen by insertion of the TS-3'UTR would likely be
due to changes in message stability. If luciferase activity did not correlate with
luciferase message levels, then changes would likely be due to alterations in
translational efficiency.

[0099] Cells were lysed after transfection, and total RNA was quantified and
used for semi-quantitative RT-PCR. Luciferase mRNA levels were normalized to
GAPDH mRNA levels, and were expressed as a percentage of the luciferase message
from the pGL3-control constructs bearing no TS-3'UTR sequences. Compared to the
empty pGL3-control construct, a significant decrease in luciferase mRNA levels was
observed with most TS-3'UTR bearing constructs (Fig. 7B, white bars). However, no
significant decreases in message levels were observed between the full-length (1-495)
TS-3'UTR construct and each of its deletion constructs. These results correlate with
the changes in luciferase activity seen above, and indicate that the decreases in
luciferase activity caused by insertion of the TS-3'UTR regions into the highly stable
luciferase 3'UTR were mainly due to altered mRNA stability. In addition, since no
significant changes in luciferase mRNA levels or luciferase activity were seen
between full-length and deletion constructs of the TS-3'UTR, these results indicate
that there are no major mRNA instability or translational silencing elements within the
TS-3'UTR.
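
The normalization described above (luciferase message relative to GAPDH, then expressed as a percentage of the pGL3-control value) can be sketched as follows. The function name and the band intensities are hypothetical and serve only to illustrate the arithmetic; they are not data from the experiments.

```python
def percent_of_control(luc, gapdh, control_luc, control_gapdh):
    """Luciferase signal normalized to GAPDH, as a percentage of the control construct."""
    normalized = luc / gapdh
    control_normalized = control_luc / control_gapdh
    return 100.0 * normalized / control_normalized


# Hypothetical band intensities in arbitrary units (not values from the study).
control = {"luc": 1000.0, "gapdh": 500.0}        # pGL3-control, defined as 100%
full_length = {"luc": 650.0, "gapdh": 500.0}     # construct bearing a TS-3'UTR insert

pct = percent_of_control(
    full_length["luc"], full_length["gapdh"], control["luc"], control["gapdh"]
)
print(f"Luciferase mRNA: {pct:.0f}% of control")
```

Normalizing to GAPDH before taking the ratio to control removes lane-to-lane differences in RNA input, so only construct-dependent changes remain in the reported percentage.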

2. TS-3'UTR Constructs Bearing the -6 bp/1494 Deletion
Polymorphism Have Decreased Luciferase Activity and
mRNA Levels Compared to TS-3'UTR Constructs
Containing the +6 bp.

[00100] To determine the effects of the -6 bp/1494 deletion on TS-3'UTR
regulation, a series of constructs bearing the deletion polymorphism were made. Since
the polymorphism lies in the far distal region of the 3'UTR (Fig. 8A, indicated by gaps
at nucleotide 456 of 495), it was only necessary to create constructs that were either
full-length or serial deletions from the proximal end of the 3'UTR.

[00101] 293 cells were transiently transfected with the -6 bp/1494 constructs
and incubated overnight. Cells were harvested and lysates were assayed for either
luciferase activity or luciferase mRNA levels as above. The full-length -6 bp/1494
(1-489) construct had ~35% less luciferase activity (p < 0.05) compared to its +6 bp/1494
counterpart (1-495) (Fig. 8B, black bars). This decrease in luciferase activity
correlated with a similar decrease in mRNA levels (Fig. 8B, white bars), suggesting
that the changes in luciferase activity between +6 bp and -6 bp constructs were due to
changes in mRNA stability and not translational silencing. Significant differences in
luciferase activity and mRNA levels between +6 bp and -6 bp constructs were also
observed for the 300-495 vs. 300-489 constructs and the 400-495 vs. 400-489
constructs.

[00102] These results further indicate that the -6 bp/1494 constructs had
significantly decreased mRNA stability compared with TS-3'UTR constructs bearing
the +6 bp. There was no evidence for increased translational repression from the -6
bp/1494 constructs, as seen by the correlation between decreases in luciferase activity
and mRNA levels.

3. The -6 bp/1494 Deletion in the TS-3'UTR Causes Decreased 
Message Stability. 

[00103] In order to support the theory that the decreases in luciferase protein and
mRNA levels were due to decreased message stability, an mRNA decay assay was
performed. By treating cells with actinomycin D, which inhibits new transcription,
one can measure the relative half-life, or rate of mRNA decay, of a given transcript.
Cells were transfected with either +6 bp or -6 bp/1494 TS-3'UTR constructs, and
were treated with actinomycin D in order to inhibit new transcription. Cells were
harvested every 2 hours post-treatment, for 6 hours, and total RNA was obtained and
used for RT-PCR analysis of luciferase mRNA levels. Results are displayed as the
percentage of mRNA remaining relative to time zero.

[00104] The full-length (1-495) TS-3'UTR construct had 93% mRNA remaining
after 6 hours, compared with 70% for pGL3-control (Fig. 9), showing that the
TS-3'UTR is highly stable after 6 hours. The -6 bp/1494 (1-489) TS 3'UTR construct
had 45% less mRNA remaining after 6 hours compared to its +6 bp/1494 counterpart
(p < 0.05). The 300-495 (+6 bp) and 300-489 (-6 bp) constructs were also utilized for
this assay since they displayed the most robust differences in luciferase activity and
mRNA levels. The 300-489 (-6 bp) construct had 38% less mRNA remaining after 6
hours compared with its wild-type 300-495 (+6 bp) counterpart (p < 0.05). These
results confirm that the decreases in luciferase mRNA levels in TS-3'UTR constructs
bearing the -6 bp/1494 deletion polymorphism were due to an increased rate of
mRNA degradation and not due to alterations in translational efficiency.
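
The decay assay reports the percentage of message remaining after a fixed chase time. If one assumes simple first-order (exponential) decay, which the specification does not state, that percentage can be converted to an approximate half-life. The helper below is a sketch under that assumption; the function name is hypothetical.

```python
import math


def half_life_hours(fraction_remaining, elapsed_hours):
    """Half-life implied by first-order decay: N(t) = N0 * exp(-k * t)."""
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must lie strictly between 0 and 1")
    k = -math.log(fraction_remaining) / elapsed_hours   # decay constant per hour
    return math.log(2) / k


# Fractions remaining after the 6 hour actinomycin D chase, as reported in the text.
for label, fraction in [("full-length (1-495)", 0.93), ("pGL3-control", 0.70)]:
    print(f"{label}: implied half-life ~{half_life_hours(fraction, 6):.1f} h")
```

A single time point gives only a rough estimate; fitting all sampled time points (0, 2, 4, and 6 hours) on a log scale would be the more robust approach.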

4. The -6 bp/1494 Deletion Polymorphism is Associated with 
Intratumoral TS mRNA Levels. 

[00105] Since the in vitro data showed that the -6 bp/1494 polymorphism caused
decreased mRNA stability, it was next determined whether this polymorphism was
associated with low TS mRNA expression in vivo. Intratumoral TS mRNA expression
was measured in 43 individuals with advanced colorectal carcinoma by real-time
TaqMan RT-PCR and normalized to β-actin mRNA. In order to correlate TS gene
expression with the -6 bp/1494 polymorphism, these individuals were screened for the
polymorphism by RFLP analysis from genomic DNA taken from whole blood as
previously described (Ulrich, 2000).

[00106] The distribution of genotypes for the polymorphism (Table 3) was 30%
homozygous for the insertion (+6 bp/+6 bp), 56% heterozygous for the deletion
polymorphism (+6 bp/-6 bp), and 14% homozygous for the deletion polymorphism
(-6 bp/-6 bp). The geometric mean of TS mRNA expression (Table 3) was highest
(11.35) in individuals who were homozygous for the insertion (+6 bp/+6 bp) and
lowest (2.71) in individuals homozygous for the deletion polymorphism (-6 bp/-6 bp).
The mean TS expression in individuals who were heterozygous for the polymorphism
(+6 bp/-6 bp) fell between the two extremes (5.42). The
comparison of the association between genotype (+6 bp/+6 bp vs. -6 bp/-6 bp) and TS
mRNA levels was statistically significant (p = 0.007). In addition, the overall
comparison between genotypes and TS mRNA levels was statistically significant (p =
0.017).

Table 3. Measurement of TS mRNA levels in metastatic tumor tissue and
distribution of the -6 bp/1494 deletion polymorphism among 43 Caucasian
individuals with colorectal cancer.

TS Genotype     n (1)    Genotype (%)    TS mRNA Mean (2)    95% CI (3)

+6bp/+6bp       13       30              11.35               (6.43, 20.03)
+6bp/-6bp       24       56              5.42                (3.57, 8.24)
-6bp/-6bp       6        14              2.71                (1.18, 6.26)

Comparison of TS Mean by Genotype    p-value (4)

(+6/+6) vs. (-6/-6)                  0.007
(+6/+6) vs. (+6/-6)                  0.041
(+6/-6) vs. (-6/-6)                  0.14
Overall                              0.017

1 Total number of individuals in sample population.
2 TS mean = geometric mean of mRNA expression of TS relative to β-actin mRNA.
3 95% confidence interval.
4 p-value for the overall comparison is based on the F-test; all other p-values are based
on the LSD (least significant difference) test.

[00107] These findings are consistent with the in vitro data, which indicate that
the -6 bp/1494 polymorphism is associated with decreased mRNA stability, and
provide further in vivo evidence of this association.

5. The -6 bp/1494 Deletion Polymorphism Varies Greatly
Among Different Ethnic Populations. 

[00108] Using the RFLP analysis cited above, the frequency of the -6 bp/1494 
deletion polymorphism was assessed in non-Hispanic whites, Hispanic whites, and 
African-Americans in Los Angeles, California (Table 4).



Table 4. Distribution of the -6 bp/1494 deletion polymorphism among
non-Hispanic white, Hispanic white, African-American, and Singapore Chinese
individuals.

Ethnic Group          n (1)    Genotype (%)                                Allele frequency (%)
                               +6 bp/+6 bp   +6 bp/-6 bp   -6 bp/-6 bp     +6 bp    -6 bp

Non-Hispanic white    63       40            38            22              59       41
Hispanic white        98       58            33            9               74       26
African-American      59       25            46            29              48       52
Singapore Chinese     80       50            49            1               74       26

1 Total number of individuals in sample population.

[00109] The genotype frequencies of the polymorphism in non-Hispanic whites
were 40% homozygous for the insertion (+6/+6), 38% heterozygous (+6/-6) and 22%
homozygous for the deletion polymorphism (-6/-6). The distribution of this polymorphism is
consistent with previous reports in Caucasians (Ulrich, 2000; Kumagai, 2003). The
genotype frequencies of the polymorphism were 58% (+6/+6), 33% (+6/-6) and 9%
(-6/-6) in Hispanic whites, and 25% (+6/+6), 46% (+6/-6) and 29% (-6/-6) in
African-Americans. There was a statistically significant difference in genotype
frequencies across the three racial-ethnic groups (p<0.0001). When genotype
distributions were compared between racial-ethnic groups two at a time, all pair-wise
comparisons except non-Hispanic whites versus African-Americans yielded
statistically significant differences.
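
The allele frequencies in Table 4 follow from the genotype percentages by simple allele counting: each homozygote contributes two copies of its allele and each heterozygote one copy of each. The sketch below illustrates that arithmetic; the function name is hypothetical, and because the inputs are rounded percentages the results can differ from the tabulated values by up to about one percentage point.

```python
def allele_frequencies(pct_ins_ins, pct_ins_del, pct_del_del):
    """Return (+6 bp, -6 bp) allele frequencies in percent from genotype percentages."""
    total = pct_ins_ins + pct_ins_del + pct_del_del
    plus6 = 100.0 * (pct_ins_ins + 0.5 * pct_ins_del) / total
    return plus6, 100.0 - plus6


# Genotype percentages (+6/+6, +6/-6, -6/-6) as listed in Table 4.
table4 = {
    "Non-Hispanic white": (40, 38, 22),
    "Hispanic white": (58, 33, 9),
    "African-American": (25, 46, 29),
    "Singapore Chinese": (50, 49, 1),
}

for group, genotypes in table4.items():
    plus6, minus6 = allele_frequencies(*genotypes)
    print(f"{group}: +6 bp {plus6:.1f}%, -6 bp {minus6:.1f}%")
```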

III. MATERIALS AND METHODS 

Expression, Purification and Phosphorylation of Recombinant USF-1
and Expression of USF-2

[00110] cDNA encoding USF-1 (Gregor et al., 1990) was amplified from 34Lu
human lung fibroblast cDNA. The upper primer was
5'-CGGGATCCATGAAGGGGCAGCAGAAAACAG-3' [SEQ ID NO: 2] and the lower
primer was 5'-GCTCTAGATTAGTTGCTGTCATTCTTGATGACGA-3' [SEQ ID
NO: 3], adding BamHI and XbaI restriction sites, respectively. PCR was carried out
under the following conditions using Accuzyme DNA Polymerase (Bioline): 30 cycles
of 30 s at 94 °C, 30 s at 59.3 °C, and 45 s at 72 °C. The product was digested with
BamHI and XbaI and cloned in-frame into the pProEX-HTb vector (Invitrogen), which
adds a 6-histidine tag to the N-terminus of the expressed protein. The plasmid was
transformed into the DH5α strain of E. coli (Invitrogen) and protein expression was
induced by adding IPTG to the culture to a final concentration of 0.6 mM. After
induction, the cells were centrifuged at 10,000 x g for 10 min and resuspended in 4
volumes of lysis buffer (20 mM Tris-HCl, pH 8.5 at 4 °C, 100 mM KCl, 5 mM
2-mercaptoethanol, 1 mM PMSF). Cells were lysed in a French press and cell debris
was removed by centrifugation. Supernatant was run on a Ni-NTA resin column
following the pProEX-HT Prokaryotic Expression System protocol (Invitrogen) to
isolate recombinant 6-histidine tagged USF-1.

[00111] To activate the DNA binding ability of USF-1, the recombinant protein
was phosphorylated in vitro using cdc2/p34. Cdc2/p34 was isolated by
immunoprecipitation using mouse monoclonal antibodies (sc-54, Santa Cruz
Biotechnologies). An in vitro phosphorylation reaction was carried out by adding 6 µl
of 5X cdc2 kinase buffer (1 M Tris-HCl, pH 7.5, 1 M MgCl2, and 1 M dithiothreitol),
1 µl of 1 mM ATP/1 mM MgCl2 (1 mM [γ-32P]ATP/1 mM MgCl2 for visualization of
phosphorylation), 200 ng of recombinant USF-1, and 8 µl H2O to 15 µl of the protein
A sepharose beads bound with cdc2/p34. The reaction was carried out for 20 min at
30 °C, then loaded and run on a 12.5% SDS-PAGE gel. The gel was dried and placed
in a cartridge with Kodak Biomax Maximum Sensitivity film for visualization of
[γ-32P]ATP incorporation.

[00112] USF-2 cDNA was amplified from 34Lu cDNA (upper primer
5'-CCGGAATTCCATGCCATGGACATGCTGGACCC-3' [SEQ ID NO: 4] and lower
primer 5'-GCTCTAGACATGTGTCCCTCTCTGTGCTAAGG-3' [SEQ ID NO: 5],
adding EcoRI and XbaI restriction sites, respectively) and PCR was carried out under the
following conditions using Accuzyme DNA Polymerase (Bioline, Denville Scientific):
30 cycles of 30 s at 94 °C, 30 s at 62 °C, and 45 s at 72 °C. The USF-1 and USF-2
cDNAs were cloned into the pCI-neo plasmid vector (Promega) for expression in
transient transfection experiments.

Electrophoretic Mobility Shift Assay (EMSA)
[00113] Synthetic double-stranded oligonucleotides (Integrated DNA
Technologies) corresponding to a wild-type (R) or variant (RV) 28 bp tandem repeat
sequence from the TS 5' regulatory region were labeled with [γ-32P]ATP (Amersham
Pharmacia Biotech) according to the Gel-Shift Assay Kit protocol (Promega). For
each gel shift reaction, 10,000 cpm of labeled probe was incubated with ~30 ng of
recombinant USF-1 for 20 min at room temperature in a 20 µl reaction mixture
containing 10 mM Tris-HCl, pH 7.5, 50 mM NaCl, 0.5 mM dithiothreitol, 0.5 mM EDTA,
4% glycerol, 1 mM MgCl2 and 0.1 µg of poly(dI-dC) DNA. Where indicated, unlabeled
competitor oligonucleotides were incubated for 10 min at room temperature with nuclear
extracts prior to addition of labeled probe.

[00114] Reactions using recombinant USF-1 contained ~30 ng of either USF-1 or
phospho-USF-1. Samples were loaded onto a non-denaturing 4% acrylamide gel and
electrophoresed in 0.5X TBE buffer at 350 V at 4 °C. The gels were dried and
visualized by autoradiography using Kodak BioMax Maximum Resolution Film.
Sequences of the oligonucleotides were as follows: TS wild-type repeat (R),
5'-CCGCGCCACTTGGCCTGCCTCCGTCCCG-3' [SEQ ID NO: 6]; TS variant repeat
(RV), 5'-CCGCGCCACTTcGCCTGCCTCCGTCCCG-3' [SEQ ID NO: 7]; USF-1
specific competitor oligonucleotide (Santa Cruz Biotechnology),
5'-CACCCGGTCACGTGGCCTACACC-3' [SEQ ID NO: 8]; USF-1 mutant competitor
oligonucleotide (Santa Cruz Biotechnology),
5'-CACCCGGTCAATTGGCCTACACC-3' [SEQ ID NO: 9]; poly(dI-dC) (Sigma) used
as non-specific competitor.
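
The wild-type and variant repeat sequences listed above can be checked directly for the two features the experiments rely on: the E-box consensus (CANNTG) bound by USF, and the HaeIII recognition site (GGCC) used in the RFLP analysis. The short sketch below does this by plain string and pattern matching; the helper name `describe` is hypothetical, and the variant sequence is written in upper case for matching.

```python
import re

WILD_TYPE = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"   # SEQ ID NO: 6, repeat R
VARIANT = "CCGCGCCACTTCGCCTGCCTCCGTCCCG"     # SEQ ID NO: 7, repeat RV (G to C at position 12)

E_BOX = re.compile(r"CA[ACGT]{2}TG")         # E-box consensus CANNTG
HAEIII_SITE = "GGCC"                         # HaeIII recognition sequence


def describe(seq):
    """Summarize the features of a 28 bp repeat that matter for EMSA and RFLP."""
    return {
        "length": len(seq),
        "base_12": seq[11],                  # 12th nucleotide (1-based)
        "has_e_box": E_BOX.search(seq) is not None,
        "has_haeiii_site": HAEIII_SITE in seq,
    }


for name, seq in [("R", WILD_TYPE), ("RV", VARIANT)]:
    print(name, describe(seq))
```

In the wild-type repeat the 12th nucleotide is both the last base of the E-box (CACTTG) and the first base of the HaeIII site, which is why a single G to C change removes both features at once.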

Chromatin Immunoprecipitation Assay (ChIP)
[00115] Chromatin immunoprecipitation from 293 cells was carried out using the
ChIP assay kit (Upstate Biotechnologies) according to the manufacturer's protocol.
Briefly, 1 x 10^6 cells were plated in 10 cm dishes and incubated overnight at 37 °C.
The cross-linking of protein to DNA was carried out by adding 37% formaldehyde to
the growth medium at a final concentration of 1%. The cross-linking reaction was
performed for 10 min at 37 °C. Cells were washed in ice-cold PBS containing
protease inhibitors (Protease inhibitor cocktail set III, Calbiochem) and scraped into
conical screw-cap tubes. Cells were centrifuged and resuspended in SDS lysis buffer,
then sonicated 3 times for 10 seconds at full power on ice, using a Branson 450
sonifier, to shear DNA to 200-1,000 bp fragments. Samples were centrifuged and 200
µl of sonicated cell supernatant was diluted into 1,800 µl of ChIP dilution buffer for
each protein of interest.

[00116] Salmon sperm DNA bound to Protein A agarose was added and spun
down to remove non-specific background. The rabbit polyclonal immunoprecipitating
antibodies (USF-1, sc-229x; USF-2, sc-861x, Santa Cruz Biotechnologies) were added
to each tube and incubated overnight at 4 °C with rotation. Salmon sperm
DNA/Protein A agarose was added for 1 hr at 4 °C and pelleted to isolate the
antibody/protein/histone/DNA complexes. The protein-DNA complexes were washed
and eluted, and the cross-linking was reversed by heating samples at 65 °C for 4 hours.
DNA was recovered by phenol/chloroform extraction and ethanol precipitation. PCR
was carried out using the same primers and conditions as in the RFLP protocol.

Construction of Reporter Plasmids
[00117] The TS promoter, located in the genomic sequence upstream of the 5'- 
exon of the gene, was identified and isolated. Primers were designed at -313 and 
+195 relative to transcription start and the PCR reaction yielded a 508 bp product for 
the 3R genotype and a 480 bp product for the 2R genotype. In order to isolate 3RV 
DNA, PCR amplification was performed from a random population of human genomic 
DNA and products were sequenced directly (Davis Sequencing). Fragments were
cloned into the promoter-less pGL3-Basic luciferase reporter gene vector (Promega) at
SstI and XhoI sites just upstream of the luciferase gene transcription start. Site-directed
mutagenesis was carried out according to the manufacturer's protocol (Promega) to
alter the USF-1 E-box consensus elements within the first 28 bp tandem repeat of both
the 2R and 3R constructs. The mutagenic oligonucleotide primer sequence was
5'-GTCCTGCCACCGCGCGTCTTGGCCTGCC-3' [SEQ ID NO: 10] (Integrated DNA
Technologies) and yielded the 2RmutUSF and 3RmutUSF reporter constructs. All
plasmid DNA was isolated and purified using Qiagen mini- and midi-prep kits.
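
The product sizes quoted in this paragraph follow from the primer coordinates. A minimal Python sketch of that arithmetic is given here. It assumes, as the reported 508 bp figure implies, that positions are numbered without a position zero, and that the 2R allele is shorter than the 3R allele by exactly one 28 bp repeat. The function name is hypothetical.

```python
REPEAT_LENGTH = 28  # length of one tandem repeat in bp, as stated in the text


def amplicon_length(upstream_pos, downstream_pos):
    """Length of a product spanning a negative upstream and a positive downstream coordinate.

    Assumes the numbering has no position 0 (an assumption inferred from the
    508 bp product reported for primers at -313 and +195).
    """
    if upstream_pos >= 0 or downstream_pos <= 0:
        raise ValueError("expected a negative upstream and a positive downstream position")
    return -upstream_pos + downstream_pos


size_3r = amplicon_length(-313, 195)
size_2r = size_3r - REPEAT_LENGTH
print(size_3r, size_2r)  # 508 480
```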

Cell Culture and Transient Transfections 
[00118] Human embryonic kidney 293 cells (American Type Culture Collection)
were plated in 6-well dishes at a density of 5 x 10^5 cells/well and incubated overnight
in 2.5 ml of DMEM medium supplemented with 5% (v/v) fetal bovine serum, 100
units/ml penicillin, 100 µg/ml streptomycin, 10 mM pyruvate and 2 mM L-glutamine.
The next day, growth medium was aspirated from the cells and replaced with 2.5 ml of
serum-free Opti-MEM medium (Invitrogen). A total of 5 µg of plasmid DNA (1 µg of
pCMV-β-galactosidase (Invitrogen) for standardization of transfection efficiencies, 1
µg of USF-1/pCI-neo or pCI-neo, and 3 µg of reporter construct) was diluted into 250
µl of Opti-MEM. A solution containing 250 µl of Opti-MEM and 15 µl of
Lipofectamine 2000 reagent (Invitrogen) was incubated for 5 min at room temperature
and mixed with the DNA-containing solution from the previous step. After a
twenty-minute incubation at room temperature, the DNA-Lipofectamine solution was added
drop-wise to the 293 cells in a circular fashion and cells were incubated for 2 hr at
37 °C. The solution was aspirated and replaced with 3 ml of growth medium and cells
were incubated overnight at 37 °C to allow gene expression.

Cell Transfection 

[00119] 293 cells were transfected with 3 µg of luciferase reporter
construct containing either no promoter (pGL3-Basic), the TS 5' region containing two
tandem repeats (2R), the TS 5' region containing three tandem repeats (3R), the TS 5'
region containing the 3R with a G→C SNP at the 12th nucleotide of the third repeat, or
the TS 5' region containing two or three tandem repeats with mutated E-box sites
(2RmutUSF and 3RmutUSF). Cells were co-transfected with 1 µg of empty pCI-neo
vector, 1 µg of vector containing USF-1 cDNA, 1 µg of vector containing USF-2
cDNA, or 0.5 µg of USF-1 and USF-2 containing vectors, respectively. Cells were
also co-transfected with 1 µg of the pCMV-β-galactosidase vector for standardization
of transfection efficiencies. Twenty-four hours after transfection, cells were
harvested, lysed, and assayed for β-galactosidase activity and luciferase activity.
Results from these experiments show that there was an increase in relative luciferase
activity from both the 2R and 3R constructs in the presence of USF-1.

Luciferase Assay 

[00120] Luciferase activity was determined using a luciferase assay system
(Promega) following the manufacturer's protocol. Briefly, cells were scraped into
lysis reagent, transferred to microfuge tubes and centrifuged for 30 s at 12,000 x g.
Luciferase activity was measured using a manual luminometer (Turner Designs,
TD 20/20) by mixing 100 µl of luciferase assay reagent with 20 µl of 1:10 diluted cell
lysate and reading three times at 10 s intervals for each sample. Transfection
efficiencies were obtained using a β-galactosidase assay (Promega) of cell lysates by
reading the absorbance at 420 nm. Relative luciferase activity was quantified by
standardizing luciferase activity to a transfection efficiency factor.
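
The standardization described in this paragraph may be sketched as follows. This is a minimal illustration only: it assumes that the transfection efficiency factor is simply the β-galactosidase absorbance at 420 nm for the same lysate, which the text does not state explicitly, and the function name and example readings are hypothetical.

```python
from statistics import mean


def relative_luciferase(luminometer_readings, beta_gal_a420):
    """Average replicate luminometer readings and divide by the beta-galactosidase A420.

    Using the raw A420 value as the transfection efficiency factor is an assumption.
    """
    if beta_gal_a420 <= 0:
        raise ValueError("beta-galactosidase absorbance must be positive")
    return mean(luminometer_readings) / beta_gal_a420


# Hypothetical readings for one sample, read three times as in the protocol.
example = relative_luciferase([100.0, 110.0, 90.0], 0.5)
print(example)  # 200.0
```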

Genotyping of 2R, 3R and 3RV by Restriction Fragment Length 
Polymorphism Analysis 

[00121] Genomic DNA from 100 colorectal cancer patients was isolated from
200 µl of whole blood using the QiaAmp kit (Qiagen, Valencia, CA). To isolate the
region of DNA containing the tandem repeats, PCR primers were designed at +15 and
+195 relative to transcription start. The upper primer sequence was
5'-CGAGCAGGAAGAGGCGGAG-3' [SEQ ID NO: 11] and the lower primer sequence
was 5'-TCCGAGCCGGCCACAGGCAT-3' [SEQ ID NO: 12]. PCR was carried out for
35 cycles of 30 s at 94 °C, 30 s at 60 °C, and 1 min at 72 °C. A 15 µl aliquot of the PCR
reaction was digested with HaeIII restriction enzyme in a 20 µl reaction volume. The
digested and undigested PCR products from each patient were loaded into adjacent
lanes on a 3% SeaPlaque agarose (BioWhittaker Molecular Applications) gel
containing ethidium bromide (0.5 mg/ml) and electrophoresed in 0.5x TBE.
Genotyping was performed twice for all samples by independent investigators.
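
The principle of the restriction digest can be illustrated in silico. The minimal Python sketch below cuts a sequence at every HaeIII recognition site (GG^CC, blunt) and is applied to the isolated 28 bp repeats of SEQ ID NO: 6 and SEQ ID NO: 7. Because the text does not give the sequence of the full PCR product, the sketch does not reproduce the band sizes seen on the gel; the function name is hypothetical.

```python
HAEIII_SITE = "GGCC"  # HaeIII recognition sequence; cleavage occurs between GG and CC

# Repeat sequences copied from SEQ ID NO: 6 and SEQ ID NO: 7.
REPEAT_WT = "CCGCGCCACTTGGCCTGCCTCCGTCCCG"
REPEAT_VARIANT = "CCGCGCCACTTCGCCTGCCTCCGTCCCG"


def haeiii_fragment_lengths(seq):
    """Return fragment lengths (5' to 3') after cutting at every GG^CC site."""
    seq = seq.upper()
    cut_points = []
    start = seq.find(HAEIII_SITE)
    while start != -1:
        cut_points.append(start + 2)  # blunt cut after the second G
        start = seq.find(HAEIII_SITE, start + 1)
    bounds = [0] + cut_points + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


print(haeiii_fragment_lengths(REPEAT_WT))       # [13, 15]
print(haeiii_fragment_lengths(REPEAT_VARIANT))  # [28]
```

In this sketch the wild-type repeat carries one HaeIII site spanning nucleotide 12, while the G to C substitution removes it from the variant repeat.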

Patient Selection 

[00122] The presence of the new SNP within the TS gene was confirmed in
Non-Hispanic Whites. Individuals (disease-free controls) were initially recruited for a
cancer case-control study in California as described in Castelao et al., 2001. Patients
included in this study had metastatic colorectal cancer and were enrolled in the
following protocols: Southwest Oncology Group protocol 9420 (19 patients, 5-FU
doses used: CI (continuous infusion) 300 mg/m²/d v. CI 2600 mg/m²/d q weekly)
opened for accrual in May 1995 and closed in May 1999; and the University of
Southern California protocol 3C-92-2 (21 patients, 5-FU doses used: CI 200 mg/m²/d
q weekly for three weeks followed by one week rest) opened for accrual in September
1992 and closed in June 1995. All patients signed an informed consent to participate
in the clinical trial and for evaluation of the TS polymorphism. Genotyping for the TS
polymorphism was performed on paraffin-embedded tissues in all patients.
[00123] All patients had bi-dimensionally measurable disease at the time of
protocol entry. Responders to therapy were classified as those patients whose tumor
burden (the sum, over all measurable lesions, of the products of the largest diameter and
its perpendicular diameter) decreased by 50% or more for at least six weeks.
Progressive disease was defined as a 25% or more increase in tumor burden (compared
to the smallest measurement) or the appearance of new lesions. Patients who did not
experience a response and did not progress within the first 12 weeks following the
start of 5-FU/leucovorin were classified as having stable disease.
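
The response criteria in this paragraph can be written as a small decision rule. The following Python sketch is a simplified illustration under stated assumptions: progression is checked before response, the 12-week observation window for stable disease is not modeled, and all function names, parameter names and example values are hypothetical.

```python
def tumor_burden(lesions):
    """Sum, over all measurable lesions, of largest diameter x perpendicular diameter."""
    return sum(largest * perpendicular for largest, perpendicular in lesions)


def classify_response(baseline, smallest, current, new_lesions, weeks_decrease_held):
    """Classify a patient from tumor burden values (simplified sketch).

    baseline: burden at protocol entry
    smallest: smallest burden measured so far
    current: burden at the current evaluation
    new_lesions: True if any new lesion has appeared
    weeks_decrease_held: weeks for which a decrease of 50% or more has persisted
    """
    if new_lesions or current >= 1.25 * smallest:
        return "progressive disease"
    if current <= 0.5 * baseline and weeks_decrease_held >= 6:
        return "response"
    return "stable disease"


print(tumor_burden([(3.0, 2.0), (1.5, 2.0)]))                # 9.0
print(classify_response(100.0, 40.0, 40.0, False, 8))       # response
print(classify_response(100.0, 80.0, 100.0, False, 0))      # progressive disease
print(classify_response(100.0, 90.0, 95.0, False, 0))       # stable disease
```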
[00124] Survival was computed as the number of months from the initiation of 
chemotherapy with 5-FU to death of any cause. Patients who were alive at the last 
follow-up evaluation were censored at that time. 

Statistical Analysis 

[00125] Contingency tables and Fisher's exact test (Metha et al., 1983) were used 
to summarize the association of response (grouped as response, stable disease, and 
progressive disease) to 5-FU with the TS genotypes. Kaplan-Meier plots (Kaplan et 
al., 1958) and the log-rank test (Miller et al., 1981) were used to compare survival of 
patients according to TS genotypes. Median survival was calculated based on the 
Kaplan-Meier estimator. All p-values are two-sided. 
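
For readers unfamiliar with the Kaplan-Meier estimator, a minimal pure-Python sketch is given here. It computes the product-limit survival curve with right-censoring and reads off the median as the first time at which estimated survival falls to 0.5 or below. The log-rank test and Fisher's exact test are not implemented, and the follow-up times in the example are hypothetical values, not data from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one death.

    times: follow-up in months; events: True for death, False for censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or less, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical example: five patients, two of them censored.
months = [2, 4, 4, 6, 8]
died = [True, True, False, True, False]
curve = kaplan_meier(months, died)
print(median_survival(curve))  # 6
```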

Preparation and Study of Six Base Pair Deletion in the 3' UTR

Cell Culture

[00126] The 293 human embryonic kidney (HEK) cell line was obtained from
ATCC and cells were cultured in Dulbecco's modified Eagle medium supplemented
with 5% (v/v) fetal bovine serum, 100 units/ml penicillin, 100 µg/ml streptomycin, 10
mM pyruvate and 2 mM L-glutamine. Cells for all experiments were confluent, and
used within 10 passages of the original stock supplied by ATCC.

Reporter Gene Construction
[00127] Various regions of the thymidylate synthase gene encoding the 3' UTR
were amplified from genomic DNA by PCR using primers terminating in an SpeI
recognition sequence, which produces ends compatible with XbaI-digested fragments.
Products containing the +6 bp/1494 polymorphism and the -6 bp/1494 polymorphism
were obtained through PCR by pre-screening human genomic DNA for homozygous
template samples for each polymorphism by RFLP analysis as previously described
(Ulrich, 2000). DNA fragments were digested, purified by agarose gel electrophoresis
and extracted using a DNA gel extraction kit (Millipore). PCR products were ligated
into the pGL3-control vector (Promega) within the 3' UTR of the firefly luciferase
gene at a unique XbaI site. The orientation, sequence, and 1494 polymorphisms of all
constructs were confirmed by sequencing (Davis Sequencing).

Transient Transfections 
[00128] 293 cells were transiently transfected using the LipofectAMINE 2000
transfection reagent (Invitrogen). Cells were plated in six-well plates at a density of
1 x 10^6 cells/well and incubated overnight. Transfections were carried out following
the manufacturer's protocol (Invitrogen). 1.5 µg of reporter gene plasmid DNA and
0.5 µg of pCMV-β-galactosidase plasmid DNA (Invitrogen) for standardization were
mixed in 500 µl of serum-free medium with 4 µl of transfection reagent and incubated
for 20 minutes at room temperature. The DNA-LipofectAMINE complex was added
dropwise to each well and cells were incubated overnight for gene expression. Cells
were either treated with actinomycin D, or lysed directly in culture dishes for
luciferase assays or mRNA quantitation, 24 hours post-transfection.

Luciferase Assays 

[00129] Luciferase activity was determined using a luciferase assay system
(Promega) following the manufacturer's protocol. 350 µl of cell culture lysis reagent
was added to each well and cells were scraped and transferred to microfuge tubes.
Cellular debris was removed by centrifugation for 2 minutes at 12,000 rpm.
Supernatant was diluted 1:10 in cell culture lysis reagent and assayed for luciferase
activity using a manual luminometer (TD 20/20, Turner Designs).
[00130] Luciferase assays were performed by mixing 100 µl of luciferase assay
reagent with 20 µl of diluted supernatant. Light output was measured over a
ten-second time period in triplicate for each sample. Relative luciferase activity was
calculated by averaging the readings and then normalizing to transfection efficiencies
by measuring β-galactosidase activity. Relative β-galactosidase activity was
measured using an assay kit (Promega) and by determining the absorbance of samples
at 420 nm. All luciferase values are expressed as the percentage of relative luciferase
activity compared to pGL3-control.

Semi-Quantitative Reverse Transcriptase-PCR 
[00131] Total RNA was isolated from transfected cells using the RNeasy Mini
Kit (Qiagen). Total RNA was treated with DNase I while on the mini-columns to
eliminate amplification of reporter plasmid DNA and genomic DNA. Total RNA was
quantified and normalized for amplification by RT-PCR using the One Step RT-PCR
Kit (Qiagen). cDNA was run on a 2% agarose gel and band intensity of luciferase and
glyceraldehyde-3-phosphate dehydrogenase (GAPDH) products was quantified by
densitometry using Eagle Eye software (Stratagene). Luciferase amplification primers
were 5'-GCCTGAAGTCTCTGATTAAGT-3' [SEQ ID NO: 13] for the forward primer and
5'-ACACCTGCGTCGAAGATGT-3' [SEQ ID NO: 14] for the reverse primer (97 bp
product).

[00132] Amplification primers for GAPDH were
CCCCTGGCCAAGGTCATCCATGACAACTTT [SEQ ID NO: 15] for the forward
primer and GGCCATGAGGTCCACCACCCTGTTGCTGTA [SEQ ID NO: 16] for
the reverse primer (510 bp product). 15 pmol of each luciferase primer and 3 pmol of
each GAPDH primer (internal control) were used in each reaction. The PCR
conditions consisted of 50 °C for 30 min for the RT reaction and a hot start at 95 °C for
15 min, followed by 25 cycles of 1 min at 94 °C, 1 min at 58 °C, and 1:30 min at 72 °C,
followed by 72 °C for 10 min. The amount of luciferase message in each RNA sample
was quantified and normalized to GAPDH content and is expressed as a percentage of
luciferase cDNA compared to cells transfected with the pGL3-Control vector.
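
The normalization described in this paragraph can be illustrated with a short Python sketch. It assumes that normalization means dividing the luciferase band intensity by the GAPDH band intensity from the same lane and expressing that ratio relative to the same ratio for pGL3-Control cells; the function name and the densitometry values are hypothetical.

```python
def percent_of_control(luc_band, gapdh_band, control_luc_band, control_gapdh_band):
    """Luciferase/GAPDH ratio of a sample as a percentage of the control ratio."""
    if gapdh_band <= 0 or control_gapdh_band <= 0 or control_luc_band <= 0:
        raise ValueError("band intensities used as denominators must be positive")
    sample_ratio = luc_band / gapdh_band
    control_ratio = control_luc_band / control_gapdh_band
    return 100.0 * sample_ratio / control_ratio


# Hypothetical densitometry values in arbitrary units.
print(percent_of_control(50.0, 100.0, 100.0, 100.0))  # 50.0
```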

Reporter Gene mRNA Decay 
[00133] 293 (HEK) cells were transiently transfected and incubated for 24 hours
to allow for luciferase gene expression. Media was aspirated and replaced with media
containing actinomycin D (10 µg/ml) in order to inhibit new transcription. Total RNA
was isolated at various time points after actinomycin D treatment, and luciferase
mRNA content was determined by RT-PCR as described above. Luciferase mRNA
levels were normalized to GAPDH mRNA content and are expressed as a percentage
of the mRNA level present at time zero. Data were plotted by linear regression
analysis using the Prism program (Graph Pad, Inc.).
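
A minimal sketch of the regression step is given below in Python rather than Prism. It fits an ordinary least-squares line to percent mRNA remaining versus time and, as an additional illustration not stated in the text, reports the time at which the fitted line reaches 50% of the time-zero level. The function names and the time course are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_to_half(slope, intercept):
    """Time at which the fitted line reaches 50%; None if the line does not decline."""
    if slope >= 0:
        return None
    return (50.0 - intercept) / slope


# Hypothetical time course: hours after actinomycin D, percent of time-zero mRNA.
hours = [0.0, 2.0, 4.0, 6.0]
percent_remaining = [100.0, 80.0, 60.0, 40.0]
slope, intercept = linear_fit(hours, percent_remaining)
print(slope, intercept, time_to_half(slope, intercept))  # -10.0 100.0 5.0
```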

Statistical Analysis 

[00134] All experiments were performed on three separate occasions, each in 
duplicate. Data are expressed as the means ± S.E. Comparison of means was 
performed using the Student's t test. 
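
The summary statistics named in this paragraph can be sketched in Python as follows. The text does not say whether the t test was paired or unpaired, so the sketch assumes an unpaired two-sample test with pooled variance; it returns the t statistic and degrees of freedom only and does not compute a p-value. The function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of replicate values."""
    return mean(values), sqrt(variance(values) / len(values))


def student_t(a, b):
    """Unpaired two-sample Student's t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1.0 / na + 1.0 / nb))
    return t, df


# Hypothetical replicate measurements for two constructs.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
t_stat, dof = student_t(group_a, group_b)
print(round(t_stat, 3), dof)  # -3.674 4
```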

Patient Selection, mRNA Quantitation, and Statistical Analysis 
[00135] The 43 patients in this study had advanced colorectal carcinoma and
were previously untreated. All patients signed an informed consent for tissue
collection and evaluation of determinants of 5-FU efficacy and toxicity. PCR
amplification and RFLP analysis were performed to identify the TS 6 bp/1494
genotype of each patient as previously described (Ulrich, 2000). TS mRNA was
measured using a quantitative RT-PCR method as described in detail elsewhere 
(Horikoshi, 1992). 

Allelic Frequency Analysis 
[00136] TS genotype measurements were performed on 63 non-Hispanic white,
98 Hispanic white, and 59 African-American subjects in Los Angeles, California and
on 80 Chinese subjects in Singapore using an RFLP-based analysis as previously
described (Ulrich, 2000). The 63 non-Hispanic white subjects represented a random
sample of the 691 white controls from a recently completed population-based
case-control study of bladder cancer in Los Angeles County (Castelao, 2001). The 59
African-American (34 bladder cancer cases plus 25 controls) and 98 Hispanic (50
bladder cancer cases plus 48 controls) subjects also were participants of this Los
Angeles Bladder Cancer Study (Castelao, 2001). Among African-American or
Hispanic white subjects, there was no statistically significant difference in genotypic
distributions between bladder cancer cases and controls. Therefore, frequencies were
reported for all subjects combined within each race. The 80 Singapore Chinese
subjects were a random sample of the 63,000 participants of the Singapore Chinese
Health Study, an ongoing prospective cohort study focusing on diet and cancer
development (Seow, 2002). The chi-square test was used to examine possible
differences in genotype distributions by race. All p-values quoted are two-sided;
p-values less than 0.05 are considered statistically significant.
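
The chi-square comparison of genotype distributions can be sketched in pure Python as shown here. The sketch computes the Pearson chi-square statistic and degrees of freedom for a contingency table of counts and leaves the p-value lookup to a statistics package or table. The function name and the counts in the example are hypothetical and are not data from the study.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    table: list of rows (for example, one row per population), each a list of
    genotype counts in the same column order.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical 2 x 2 table of genotype counts in two populations.
stat, dof = chi_square([[10, 20], [20, 10]])
print(round(stat, 3), dof)  # 6.667 1
```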
CLAIMS

We claim: 

1. An isolated nucleic acid molecule of SEQ ID NO: 1, wherein G is replaced by C
at nucleotide 12.

2. The isolated nucleic acid molecule of claim 1 and an isolated nucleic acid
molecule of SEQ ID NO: 1, wherein the two isolated nucleic acid molecules are forms
of a single nucleotide polymorphism in the 5' region of a thymidylate synthase (TS)
gene.

3. A single-stranded nucleic acid probe that hybridizes to the isolated nucleic acid
molecule of claim 1, but not to SEQ ID NO: 1.

4. The probe of claim 3, wherein the nucleic acid is DNA.

5. The probe of claim 3, wherein the probe is detectably labeled.

6. A diagnostic kit comprising the probe as defined by claim 3, and/or an
allele-specific nucleic acid primer of 8-40 nucleotides that specifically hybridizes to
and detects the molecule of claim 1, and instructions for use.

7. The diagnostic kit of claim 6, wherein the primer is of 12-35 nucleotides. 

8. The diagnostic kit of claim 6, wherein the primer is of 17-35 nucleotides. 

9. The diagnostic kit of claim 6, wherein hybridization indicates reduced 
transcriptional activity of the TS gene, and a corresponding decreased risk of 
developing a disease. 

10. The diagnostic kit of claim 6, wherein the disease is cancer or cardiovascular
disease.

11. A method for determining whether an individual has or has a heightened
predisposition to cancer or cardiovascular disease, comprising: 

(a) obtaining a sample from the individual comprising nucleic acid 
molecules containing a thymidylate synthase gene; and 

(b) detecting one or more polymorphisms in the TS gene, wherein 

(i) an individual with a 3R/3R construct in the 5' region of the TS
gene more likely has or has a heightened predisposition as
compared to an individual with a 3R/3RV, 2R/2R, 2R/3R, or
2R/3RV construct;

(ii) an individual with a +6 bp/1494 3' untranslated region
polymorphism of the TS gene more likely has or has a heightened
predisposition as compared to an individual with a -6 bp/1494 3'
untranslated region polymorphism of the TS gene;

(iii) an individual with both the 3R/3R construct in the 5' region and a 
+6 bp/1494 3' untranslated region polymorphism of the TS gene 
most likely has or has the highest probability of developing 
cancer or cardiovascular disease (CVD). 

12. The method of claim 11, wherein an individual with the 3R/3R construct in the
5' region of the TS gene has two active USF consensus sequences in each 3R portion,
resulting in greater transcriptional activity as compared to a subject with one active
USF sequence in either a 2R construct or a variable 3RV construct.

13. The method of claim 11, wherein the detecting step comprises amplifying the
portion of the nucleic acid molecule comprising the TS gene.

14. The method of claim 13, wherein the amplifying uses the method of
polymerase chain reaction.

15. The method of claim 11, wherein the determining step comprises sequencing
the portion of the nucleic acid molecule comprising the TS gene.

16. The method of claim 11, wherein the determining step comprises the use of
high throughput screening.

17. The method of claim 11, wherein a 3R construct comprises SEQ ID NO: 1 and a
3RV construct comprises SEQ ID NO: 1, wherein at position 12, G is replaced by C.

18. The method of claim 17, wherein the replacement of G by C at position 12 is
associated with the efficacy of a chemotherapeutic or anti-CVD drug, and wherein if
the replacement of G by C at position 12 has occurred, the chemotherapeutic or
anti-CVD drug is more efficacious than if the substitution had not occurred.

19. The method of claim 11, wherein the TS gene is derived from bodily fluid of
the subject. 

20. The method of claim 19, wherein the bodily fluid is blood.

SEQUENCE LISTING

<110> University of Medicine and Dentistry of New Jersey
      University of Southern California

<120> THYMIDYLATE SYNTHASE POLYMORPHISMS FOR USE IN SCREENING FOR
      CANCER SUSCEPTIBILITY

<130> 54704.8060.WO00

<150> 60/420,164

<151> 2002-10-21

<160> 16

<170> PatentIn version 3.2

<210> 1
<211> 28
<212> DNA
<213> Homo sapiens
<400> 1
ccgcgccact tggcctgcct ccgtcccg 28

<210> 2
<211> 30
<212> DNA
<213> Homo sapiens
<400> 2
cgggatccat gaaggggcag cagaaaacag 30

<210> 3
<211> 34
<212> DNA
<213> Homo sapiens
<400> 3
gctctagatt agttgctgtc attcttgatg acga 34

<210> 4
<211> 32
<212> DNA
<213> Homo sapiens
<400> 4
ccggaattcc atgccatgga catgctggac cc 32

<210> 5
<211> 32
<212> DNA
<213> Homo sapiens
<400> 5
gctctagaca tgtgtccctc tctgtgctaa gg 32

<210> 6
<211> 28
<212> DNA
<213> Homo sapiens
<400> 6
ccgcgccact tggcctgcct ccgtcccg 28

<210> 7
<211> 28
<212> DNA
<213> Homo sapiens
<400> 7
ccgcgccact tcgcctgcct ccgtcccg 28

<210> 8
<211> 23
<212> DNA
<213> Homo sapiens
<400> 8
cacccggtca cgtggcctac acc 23

<210> 9
<211> 23
<212> DNA
<213> Homo sapiens
<400> 9
cacccggtca attggcctac acc 23

<210> 10
<211> 28
<212> DNA
<213> Homo sapiens
<400> 10
gtcctgccac cgcgcgtctt ggcctgcc 28

<210> 11
<211> 19
<212> DNA
<213> Homo sapiens
<400> 11
cgagcaggaa gaggcggag 19

<210> 12
<211> 20
<212> DNA
<213> Homo sapiens
<400> 12
tccgagccgg ccacaggcat 20

<210> 13
<211> 21
<212> DNA
<213> Homo sapiens
<400> 13
gcctgaagtc tctgattaag t 21

<210> 14
<211> 19
<212> DNA
<213> Homo sapiens
<400> 14
acacctgcgt cgaagatgt 19

<210> 15
<211> 30
<212> DNA
<213> Homo sapiens
<400> 15
cccctggcca aggtcatcca tgacaacttt 30

<210> 16
<211> 30
<212> DNA
<213> Homo sapiens
<400> 16
ggccatgagg tccaccaccc tgttgctgta 30